README file for rgadb of RiceGAAS                              [2002.11.21]
-----------------------------------------------------------------------------

RiceGAAS (Rice Genome Automated Annotation System) collects the rice genome
entries in GenBank. Please see the following URLs for details about RiceGAAS.

        http://RiceGAAS.dna.affrc.go.jp/
        http://RiceGAAS.dna.affrc.go.jp/rgadb/
        http://RiceGAAS.dna.affrc.go.jp/RiceGAAS_system.html

RiceGAAS collects the rice genome entries from the daily updates of the
GenBank database. The collection rules are as follows:

        SOURCE = "oryza sativa"
        and 9999bp < [sequence size] < 1Mbp
        not DIVISION = "GSS" or "STS"

Each entry from GenBank is assigned to the chromosome it is anchored on, or
to "unknown" if it does not have any chromosomal information.

Two data sets, as shown below, are made every week.

        GenBank files
        FASTA files

- FTP files

  ftp://ftp.dna.affrc.go.jp/pub/RiceGAAS/
        +-- 20020918  .. data directory (made at 2002-09-18)
        +-- yyyymmdd  .. data directory (made at yyyy-mm-dd)
        +-- README    .. this document
        +-- current   .. link to the latest data directory

  Each data directory on the FTP site above contains the GenBank files and
  FASTA files listed below.

- GenBank files

        RiceGAAS.GenBank.chr01.tar.gz   .. chromosome 1 genbank files
        RiceGAAS.GenBank.chr02.tar.gz   .. chromosome 2 genbank files
        RiceGAAS.GenBank.chr03.tar.gz   .. chromosome 3 genbank files
        RiceGAAS.GenBank.chr04.tar.gz   .. chromosome 4 genbank files
        RiceGAAS.GenBank.chr05.tar.gz   .. chromosome 5 genbank files
        RiceGAAS.GenBank.chr06.tar.gz   .. chromosome 6 genbank files
        RiceGAAS.GenBank.chr07.tar.gz   .. chromosome 7 genbank files
        RiceGAAS.GenBank.chr08.tar.gz   .. chromosome 8 genbank files
        RiceGAAS.GenBank.chr09.tar.gz   .. chromosome 9 genbank files
        RiceGAAS.GenBank.chr10.tar.gz   .. chromosome 10 genbank files
        RiceGAAS.GenBank.chr11.tar.gz   .. chromosome 11 genbank files
        RiceGAAS.GenBank.chr12.tar.gz   .. chromosome 12 genbank files
        RiceGAAS.GenBank.unknown.tar.gz .. unknown genbank files

        % gunzip RiceGAAS.GenBank.chr01.tar.gz
        % tar tvf RiceGAAS.GenBank.chr01.tar|more
        drwxr-xr-x 100/100       0 2002-04-25 19:51 chr01/
        -rw-r--r-- 100/100  145408 2000-03-01 22:40 chr01/10A19I
        -rw-r--r-- 100/100  195764 2002-03-26 01:33 chr01/B1085F09
        -rw-r--r-- 100/100  193497 2002-03-26 01:34 chr01/OJ1174_D05
        -rw-r--r-- 100/100  201668 2002-03-26 01:35 chr01/OSJNBa0025P13
        -rw-r--r-- 100/100  209261 2002-03-26 01:25 chr01/OSJNBa0004B13
        -rw-r--r-- 100/100  212063 2002-03-26 01:29 chr01/P0666G04
        ::

- FASTA files

  FASTA files contain only the sequence data extracted from the GenBank
  files, with comment lines as follows:

        >[Accession No.] ([clone name]) [chromosome No.] [location(cM)] [size(bp)]

  The "location" is the position on the RGP genetic map shown on the
  following web page.

        http://rgp.dna.affrc.go.jp/cgi-bin/statusdb/statassign.pl

  FASTA files:

        RiceGAAS.chr01.fasta.gz   .. chromosome 1 fasta file
        RiceGAAS.chr02.fasta.gz   .. chromosome 2 fasta file
        RiceGAAS.chr03.fasta.gz   .. chromosome 3 fasta file
        RiceGAAS.chr04.fasta.gz   .. chromosome 4 fasta file
        RiceGAAS.chr05.fasta.gz   .. chromosome 5 fasta file
        RiceGAAS.chr06.fasta.gz   .. chromosome 6 fasta file
        RiceGAAS.chr07.fasta.gz   .. chromosome 7 fasta file
        RiceGAAS.chr08.fasta.gz   .. chromosome 8 fasta file
        RiceGAAS.chr09.fasta.gz   .. chromosome 9 fasta file
        RiceGAAS.chr10.fasta.gz   .. chromosome 10 fasta file
        RiceGAAS.chr11.fasta.gz   .. chromosome 11 fasta file
        RiceGAAS.chr12.fasta.gz   .. chromosome 12 fasta file
        RiceGAAS.unknown.fasta.gz .. unknown fasta file

        % gunzip RiceGAAS.chr01.fasta.gz
        % grep "^>" RiceGAAS.chr01.fasta |more
        >10A19I (10A19I) chr01 99587bp
        >AP003103 (B1085F09) chr01 52.7cM 132713bp
        >AP003118 (OJ1174_D05) chr01 20.2cM 128525bp
        >AP003140 (OSJNBa0025P13) chr01 58.1cM 133242bp
        >AP003018 (OSJNBa0004B13) chr01 142268bp
        >AP003047 (P0666G04) chr01 22.6cM 141983bp
        >AP003074 (OSJNBa0004G10) chr01 42.4-43.2cM 150379bp
        >AP003104 (OSJNBa0038J17) chr01 30.5cM 180186bp
        ::

If you have any questions, please send e-mail to "support@dna.affrc.go.jp".
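
- Example: reading the FASTA comment lines

  The comment-line format described above can be split into its fields
  with a short script. The following is only a minimal sketch in Python,
  not part of RiceGAAS. The names HEADER_RE, FastaHeader and parse_header
  are hypothetical. It assumes, based on the sample grep output in this
  document, that the location field may be missing and may be either a
  single value or a range such as "42.4-43.2cM". A single value is stored
  here as a range with equal ends, which is a choice of this sketch.

```python
import re
from typing import NamedTuple, Optional, Tuple

# Pattern inferred from the sample comment lines in this README:
#   >[Accession No.] ([clone name]) [chromosome No.] [location(cM)] [size(bp)]
# The location field is treated as optional (assumption based on the samples).
HEADER_RE = re.compile(
    r"^>(?P<accession>\S+)\s+\((?P<clone>[^)]*)\)\s+(?P<chrom>\S+)"
    r"(?:\s+(?P<location>[\d.]+(?:-[\d.]+)?)cM)?"
    r"\s+(?P<size>\d+)bp\s*$"
)


class FastaHeader(NamedTuple):
    accession: str
    clone: str
    chromosome: str
    # (start, end) in cM; None when the comment line has no location.
    location_cm: Optional[Tuple[float, float]]
    size_bp: int


def parse_header(line: str) -> FastaHeader:
    """Parse one '>' comment line into its fields (hypothetical helper)."""
    m = HEADER_RE.match(line.strip())
    if m is None:
        raise ValueError("unrecognized comment line: %r" % line)
    loc = m.group("location")
    if loc is None:
        location = None
    else:
        # "52.7" becomes (52.7, 52.7); "42.4-43.2" becomes (42.4, 43.2).
        start, _, end = loc.partition("-")
        location = (float(start), float(end or start))
    return FastaHeader(
        accession=m.group("accession"),
        clone=m.group("clone"),
        chromosome=m.group("chrom"),
        location_cm=location,
        size_bp=int(m.group("size")),
    )


if __name__ == "__main__":
    # Sample lines copied from the grep output shown in this README.
    samples = [
        ">10A19I (10A19I) chr01 99587bp",
        ">AP003103 (B1085F09) chr01 52.7cM 132713bp",
        ">AP003074 (OSJNBa0004G10) chr01 42.4-43.2cM 150379bp",
    ]
    for s in samples:
        print(parse_header(s))
```

  To process a whole file, the same function could be applied to every
  line of an uncompressed FASTA file that starts with ">". Lines that do
  not fit the assumed pattern raise ValueError instead of being skipped
  silently, so that unexpected header variants are noticed.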