LOCUS       KJ755549                7913 bp    DNA     linear   BCT 29-MAR-2016
DEFINITION  Escherichia coli strain C 771 serotype O142 O-antigen gene cluster,
            complete sequence.
ACCESSION   KJ755549
VERSION     KJ755549.1
KEYWORDS    .
SOURCE      Escherichia coli
  ORGANISM  Escherichia coli
            Bacteria; Proteobacteria; Gammaproteobacteria; Enterobacterales;
            Enterobacteriaceae; Escherichia.
REFERENCE   1  (bases 1 to 7913)
  AUTHORS   DebRoy,C., Fratamico,P.M., Yan,X., Baranzoni,G., Liu,Y.,
            Needleman,D.S., Tebbs,R., O'Connell,C.D., Allred,A., Swimley,M.,
            Mwangi,M., Kapur,V., Raygoza Garay,J.A., Roberts,E.L. and Katani,R.
  TITLE     Comparison of O-Antigen Gene Clusters of All O-Serogroups of
            Escherichia coli and Proposal for Adopting a New Nomenclature for
            O-Typing
  JOURNAL   PLoS ONE 11 (1), E0147434 (2016)
   PUBMED   26824864
  REMARK    Publication Status: Online-Only
REFERENCE   2  (bases 1 to 7913)
  AUTHORS   Yan,X., Fratamico,P.M., Tebbs,R.S., O'Connell,C.D., Baranzoni,G.M.,
            Liu,Y. and DebRoy,C.
  TITLE     Direct Submission
  JOURNAL   Submitted (25-APR-2014) Molecular Characterization of Foodborne
            Pathogens Research Unit, USDA-ARS, 600 East Mermaid Lane, Wyndmoor,
            PA 19038, USA
COMMENT     ##Assembly-Data-START##
            Assembly Method       :: CLC Genomics Workbench v. 7.0
            Coverage              :: Average 50X
            Sequencing Technology :: IonTorrent
            ##Assembly-Data-END##
FEATURES             Location/Qualifiers
     source          1..7913
                     /organism="Escherichia coli"
                     /mol_type="genomic DNA"
                     /strain="C 771"
                     /serotype="O142"
                     /db_xref="taxon:562"
     misc_feature    1..7913
                     /note="O-antigen gene cluster"
     gene            <1..659
                     /gene="rmlA"
     CDS             <1..659
                     /gene="rmlA"
                     /codon_start=3
                     /transl_table=11
                     /product="glucose-1-phosphate thymidylyltransferase"
                     /protein_id="AIG62441.1"
                     /translation="WGLNLQYQVQPRPDGLAQAFIIGEEFIGKDDCALVLGDNIFYGH
                     DLPKLTDIAVNKKSGATVFAYHVNDPERYGVIEFDKDGTAISLEEKPLVPKSNYAVTG
                     LYFYDNSVIEMAKNLKPSVRGELEITDINRIYMEQGKLCRMMGRGYAWLDTGTHQSLI
                     EASNFIATIEERQGLKVSCPEEIAFRKGFIDAEQVKKLAAPLSKNAYGQYLLKMINGD
                     "
     gene            649..1221
                     /gene="rmlC"
     CDS             649..1221
                     /gene="rmlC"
                     /codon_start=1
                     /transl_table=11
                     /product="dTDP-4-dehydrorhamnose 3,5-epimerase"
                     /protein_id="AIG62445.1"
                     /translation="MVINKMNVIKTEIPDVLIFEPKVFCDARGFFFESFNLKIFAEAV
                     GRNVEFVQDNHSKSKKGVLRGLHYQVAPFAQGKLVRCIAGEVYDVAVDLRKSSPTFAK
                     WVGVNLSAKNKRQLWIPEGFAHGFMVLSDEAEFVYKTTNYYSPKSERSINYADSQINI
                     KWPSSFNLKLSRKDEIAPQLAIILENELFE"
     gene            1218..2462
                     /gene="wzx"
     CDS             1218..2462
                     /gene="wzx"
                     /codon_start=1
                     /transl_table=11
                     /product="O-antigen flippase"
                     /protein_id="AIG62446.1"
                     /translation="MSIIKNSLWNVVGYIVPAIVTIPALGILGRILGAETFGVFTLAL
                     AIVGYASIFDVGLSRAVIREIALFRDDQEEKRRIIFTASLLVTVMGVTAALVLYIASD
                     VIANLLKISSELHLSVVNSLHILSLSIPVYLVTQIWLAILEGEEKFGLLNIYKSITGS
                     LISLLPVICIFISPSIEYAIIGLVVSRLVCMLFAFFLCKRIIVESYFEFSKLTLKRML
                     MFGGWITVSNIISPLMAYFDRFIVSNQLGAAVVAFYTAPSEIIARLGIVPGAFARAIF
                     PRLSCSNDVHDRKKNKKIVSLLLFLITVPVFIVGLLASNKFMVLWMGPEFAGTSANIL
                     VILLLGFVFNSLAQVPFASIQSRGYAKITAYIHMVELIPYLMALFYFINNYGIIGAAY
                     AWSIRVTIDYILLAFFDRCFDK"
     gene            2514..3695
                     /gene="wzy"
     CDS             2514..3695
                     /gene="wzy"
                     /codon_start=1
                     /transl_table=11
                     /product="O-antigen polymerase"
                     /protein_id="AIG62447.1"
                     /translation="MLYVLTFALITSFGLFIALYLVKCNFTSPLSLHCFAWFFVSSTG
                     LFAYDEFIDFPEISFYAVMIWYLIVYFILITGELISLNIKSVNYFKNKEYICGRYWII
                     VIPLSAYTIYEIYRVGNTGPASFFLNLRLANIIDDYEGEKFTLMTAIYPVLIAMFSIV
                     CISCSSKKNKYALWLWSILFCIGTMGKFAVITPILIFYIIRELTGGLNKKRMVFIVPG
                     VISAILFMHFIRMSSGDSTTISSVLGVYIYSPLLALSKLPELNINGESGEYTFRFLIA
                     ILYKVGLSSNEPVKTILDYVNVPVPTNVFTVMQPFYQDYSLFGVAFGAIFYGIIYSSI
                     YLLAKKGNPVALLIYAVLAISLFTSFFAETLITNLAGNIKVVICIYLLWRFTVRCKIK
                     P"
     CDS             3659..4612
                     /codon_start=1
                     /transl_table=11
                     /product="rhamnosyl transferase"
                     /protein_id="AIG62448.1"
                     /translation="MEIYSKMQDKTVTILMATYNGSAFIENQILSLQQQKYKDWILYI
                     HDDGSSDDTLDIIKRIQLTEPRINLIEDGLTRLGAGKNFLSLVKYSATNYTIFCDQDD
                     IWLENKLSEMIVFADSKGLASSKLPSMIYADGYAFDDSTGEIDFCGISHNHATRLKDF
                     LFFNAGYQGCSILFNKAMVDIAANYHGYVHLHDDVVSLIAHSLGNVYFLPKKLMLYRQ
                     HLGAVTGQKKFNNRFISMLTSKVNYLLSREHFLVKRSFYDNYHHLLTAEIKNDFEVFF
                     KFCQTKNKLSQLVLLLKHDFRLNNSRLKLLLKCIVRRTFSQ"
     CDS             4609..5397
                     /codon_start=1
                     /transl_table=11
                     /product="PGL/p-HBAD biosynthesis
                     glycosyltransferase/MT3031"
                     /protein_id="AIG62449.1"
                     /translation="MISVLTPTYNRAYTLKRLYESLICQTTKSFEWVVVNDGSNDETE
                     SIIKEFQQQNIIKIIYYKQENKGKTQALNAGIQLCAGSDILIVDSDDLLTSDAIACIE
                     ASLTQEKIYNKKISGVAFRKAYLDETIIGTVFDDSVDSFCYLSATDAGHLFKGDLAYC
                     FKKEMLQMFPFPYFYNEKFVPELYIWNKITDHALVKFHKKKAVYLCEYLEDGLSKNFK
                     TQLLKNPKGFSIYYIDQFKRETNYIRKLKMLIRYFQCKLYELKK"
     gene            5394..6488
                     /gene="pglJ"
     CDS             5394..6488
                     /gene="pglJ"
                     /codon_start=1
                     /transl_table=11
                     /product="N-acetylgalactosamine-N,
                     N'-diacetylbacillosaminyl-diphospho-undecaprenol
                     4-alpha-N-acetylgalactosaminyltransferase"
                     /protein_id="AIG62450.1"
                     /translation="MNIVFVITGLGLGGAEKQVCLLADRLAEAEHQISIVSLTGECVV
                     KPDNKHIEIYHLNMDKSLISFFLGVIKLRNIISTIKPDIVHSHMFHANIMSRLSKLLL
                     PYSHKLICTAHSRYEGGRLRMLCYRLTDYFSDINTNVSKEALDEYVNNKYFSATKSIV
                     VYNGVDTEKFNFSIDNRIYIREQLSINNHDRLILSVGRLNPAKDYPNLLAAFMLLPEH
                     YKLVIIGEGDVRSQIEQIIKDHKLGTRVQLLGSVNNVNDYYSACDLFVLSSAWEGFSL
                     VVIEAMACQRIAVCTDAGGVKEAFTDRHYIVPTSNAAALARKIIEVDMLTIDRKNEIQ
                     NDNRNNVVNKFSINAIVNHWLTIYKNITMQ"
     gene            6732..7751
                     /gene="galE"
     CDS             6732..7751
                     /gene="galE"
                     /codon_start=1
                     /transl_table=11
                     /product="UDP-glucose 4-epimerase"
                     /protein_id="AIG62451.1"
                     /translation="MTILVTGGAGYIGSHTVVRLLEKGKEIVILDDFTNSFPETLNRI
                     KIITGVKPFFYEGSVLDRNLLKKIFVENNITDVIHFAGLKSVGESVSSPLKYYEVNIA
                     GSLHLVEEMITHNISNFIFSSSATVYGEPETIPLTESSRIGGTTNPYGTSKLMVEKIL
                     EDVTRSNPEFRTTILRYFNPVGAHPSGDMGEDPNGIPNNLMPYICQVAIGKYKQVSVY
                     GSDYPTKDGTGVRDFIHVMDLAEGHVAALEHRNKGPNHKVYNLGTGTGYSVLELLTAF
                     ERVTSRKVPYVLSERRPGDIAECWSNPSKAYAELGWKAKRGLEDMVRDAWNWQQKNPN
                     GYKKE"
ORIGIN      
        1 aatggggatt gaatcttcaa tatcaagtac agccaaggcc ggatggtttg gcgcaagcat
       61 ttattattgg tgaagagttc attggtaaag atgattgtgc tctagtattg ggcgataata
      121 ttttctatgg acatgattta ccaaaactca ctgatatcgc agttaataaa aaaagtggtg
      181 caactgtttt tgcatatcac gttaatgatc cggaacgtta tggcgttatt gagtttgata
      241 aagatggtac agcgatatct ttagaagaaa aaccgctagt gccaaaaagt aattatgctg
      301 taactgggct ttatttctat gacaacagtg tgattgaaat ggctaaaaat ctcaagcctt
      361 ccgtgcgtgg agagctggaa atcaccgata tcaaccgtat ttatatggag cagggaaaac
      421 tttgtcgcat gatggggagg ggctatgctt ggctggatac gggaacacat caaagcctga
      481 tagaagcgag caactttatt gctactattg aagagcgtca gggattgaaa gtttcttgtc
      541 cggaagagat agcgtttcgc aagggtttta ttgatgctga gcaggttaaa aaactagcag
      601 caccgttatc aaagaatgcc tatggacaat atctccttaa gatgattaat ggtgattaat
      661 aaaatgaatg taattaaaac agaaatccct gatgtactca tttttgagcc aaaagttttt
      721 tgcgatgctc gtggtttttt ctttgagagt tttaatctga aaatatttgc agaggccgtt
      781 ggtagaaacg ttgaatttgt tcaggacaat cattcaaaat ccaaaaaagg ggttttacga
      841 ggtcttcact atcaggtagc cccgtttgct cagggaaaat tagtgagatg cattgctggt
      901 gaagtatatg atgtcgctgt tgatctacgt aaatcatctc cgacctttgc taaatgggtt
      961 ggtgtaaatc tttcggctaa gaataaacgc caattgtgga ttcctgaagg attcgcgcat
     1021 ggttttatgg tgttgagtga tgaagctgaa tttgtataca aaactacgaa ttattatagc
     1081 cccaaatcag aacgttcaat aaactatgcc gactcacaga taaatattaa gtggccgtca
     1141 agttttaatt tgaaactttc acggaaagac gaaattgcac ctcagttggc tattatttta
     1201 gaaaatgagc tttttgaatg agcattataa aaaacagtct ttggaatgtt gtcgggtata
     1261 ttgtccctgc tattgtcacg atccctgcgt tgggaatatt aggtcgaatc ttaggtgcag
     1321 aaaccttcgg tgtatttact cttgcgctag caatagttgg gtacgctagt atttttgatg
     1381 taggcttatc aagagctgtt attcgagaaa tagcgttatt tcgagatgat caagaagaaa
     1441 aaagaagaat aatattcaca gcatcattat tagttacagt aatgggagtt actgctgcat
     1501 tagtattgta tattgcaagt gatgtaatag ctaatttgtt gaagatcagc tctgagttac
     1561 atttgagtgt tgttaattct ctacatatac tctcactctc catccccgtt tatttggtaa
     1621 cacaaatatg gcttgcaatt ctagagggag aagagaagtt tggtctatta aatatttata
     1681 aatctattac tggttctcta atttcactac ttccggttat ttgtatcttt atttctccct
     1741 ccattgagta tgctattatt ggcttggttg tatctcgatt ggtttgtatg ttgtttgctt
     1801 tctttttatg taaaagaata atagtggagt catatttcga gtttagcaaa ctaacattga
     1861 aacgaatgct aatgtttggg ggatggataa cagtaagcaa tattataagt ccgttaatgg
     1921 cctattttga ccgatttatt gtttctaatc aattaggtgc agcagtcgtt gccttttata
     1981 cagctccttc tgaaattatt gcaaggttag gtattgttcc cggtgctttt gcccgcgcta
     2041 tttttcctcg tttgagttgt tcaaatgatg tccacgatag aaaaaagaac aaaaaaatag
     2101 tatcattact tcttttcctg ataacagttc cagtatttat tgtaggactt ttggccagta
     2161 ataaatttat ggttttgtgg atgggacctg aatttgcagg tacttcagct aatatattgg
     2221 ttattcttct tctcggtttt gttttcaatt cattagcaca ggtccctttt gctagtatac
     2281 agtcacgtgg ttatgctaaa atcactgcat atatacatat ggtggagcta atcccttatt
     2341 tgatggctct gttttatttt attaataact atggaattat tggtgccgcg tatgcatgga
     2401 gtataagagt gactattgat tatatattat tggcattttt tgacagatgc tttgataagt
     2461 agcataagaa aattttaata atcgctatca aagaacggag gaaatgacca tttatgttat
     2521 atgttctcac ttttgcgctc attacatcat tcgggctttt tattgcactt tatctggtca
     2581 aatgtaactt tacatcacct ctgtctttac attgtttcgc atggtttttc gttagtagca
     2641 ctggattgtt tgcttatgat gagtttattg attttcccga gataagcttc tatgcagtaa
     2701 tgatttggta cctgattgta tattttattt taataacagg tgagttaata tccttaaata
     2761 taaaaagtgt gaactatttt aaaaataagg aatatatatg tggaagatat tggatcatag
     2821 taataccatt gtctgcttat acaatttacg aaatttatag agtgggaaat actgggcctg
     2881 catcattttt ccttaattta cgtctcgcaa acattattga tgattacgaa ggtgaaaagt
     2941 tcacattgat gacggctata tatccagtct tgatagcgat gttttccatc gtatgcatat
     3001 cttgttcatc aaagaagaat aagtacgcat tatggctatg gtctattcta ttttgtattg
     3061 gcactatggg taaatttgct gtaataaccc caattttaat tttctatata atccgtgagt
     3121 tgacgggagg cttaaataaa aagaggatgg tttttattgt tccaggtgta atctctgcta
     3181 ttctttttat gcatttcata cgaatgtcta gtggagatag tactacaata agttctgttc
     3241 taggggtata tatttactcg ccactacttg cattgagtaa gttacctgaa ctaaatataa
     3301 atggtgaatc tggtgaatat acattcagat ttttgattgc aattctatat aaagtaggcc
     3361 tgtcctcaaa tgagccggtt aaaactattt tagattacgt caacgtgccc gttccaacaa
     3421 atgtttttac ggtgatgcaa ccattctacc aagattattc tttatttgga gtagcatttg
     3481 gagcaatatt ttatgggatt atatattctt caatatatct tctggcaaaa aaaggaaatc
     3541 cagttgccct gcttatttat gccgttcttg caattagcct atttacttct ttttttgcag
     3601 aaactttaat aactaacctt gcaggtaata taaaggtagt aatttgcata taccttctat
     3661 ggagatttac agtaagatgc aagataaaac cgtaacgata ttaatggcaa cttataatgg
     3721 aagtgctttt atagagaacc agattctttc attacagcaa caaaaatata aagattggat
     3781 actctatatt catgatgatg gttcctctga tgatacctta gatattataa agagaattca
     3841 gcttacagaa cctcgtatta atcttattga ggatgggctt acgagacttg gtgcagggaa
     3901 aaattttctt tcattagtaa aatattcagc tacaaattat actatatttt gcgatcaaga
     3961 tgatatttgg ctcgaaaaca aattaagcga gatgattgtt ttcgcagata gtaaagggct
     4021 ggctagtagt aagttaccat ctatgattta tgcagatggg tatgcatttg acgatagcac
     4081 aggtgagatt gatttttgtg gtatatcaca taatcatgct acgagattga aagatttctt
     4141 atttttcaat gctggctatc aagggtgttc aatattattt aataaagcaa tggttgacat
     4201 tgcagcaaat taccatggtt atgtccattt acatgatgat gtcgtaagtc tgatcgccca
     4261 ttctttgggg aatgtgtatt ttttaccaaa aaaattgatg ttatatcgcc aacacttagg
     4321 ggcagtaact ggccaaaaga agtttaataa cagattcatt tcaatgctga catccaaagt
     4381 taattatctt ttgtcaagag agcatttttt agttaagaga tctttttacg ataattacca
     4441 tcatctttta acggctgaga ttaaaaatga ctttgaagtc ttttttaaat tttgccaaac
     4501 aaaaaacaaa ctttcacagt tagtgctttt attgaagcat gattttcgct taaataacag
     4561 cagattaaag ctattgttaa aatgcatagt aagaaggaca tttagtcaat gatctccgtc
     4621 ttaacaccta cttataatcg tgcgtatacc ttgaagcgat tatacgagtc gttaatttgt
     4681 caaacgacta aatcctttga atgggttgtg gttaatgatg gtagtaacga tgagactgag
     4741 tcaataataa aagagtttca gcaacaaaat attattaaaa ttatttatta caaacaggaa
     4801 aataaaggta aaactcaggc tcttaatgca ggtatacaat tatgtgcagg aagtgatatt
     4861 ttaatcgttg atagcgatga tctattgaca tcggatgcta tagcttgcat tgaagctagt
     4921 ttgactcagg agaagatata taataaaaaa atatcggggg tggcatttcg taaagcatat
     4981 ctggatgaaa ccataattgg tactgttttt gatgatagtg ttgatagttt ttgttatctt
     5041 tctgcaactg acgctgggca tcttttcaaa ggcgatcttg catattgctt taagaaagaa
     5101 atgctgcaaa tgtttccttt tccatacttt tataacgaaa agtttgttcc tgaattatac
     5161 atttggaaca agattacaga tcacgcatta gtaaaatttc ataaaaaaaa agcagtatac
     5221 ctatgtgagt atctcgaaga tgggttatct aaaaacttca aaacacaatt attaaaaaat
     5281 cccaaagggt ttagtatcta ttatattgat cagtttaagc gtgaaaccaa ctacattcga
     5341 aaattaaaaa tgttgattcg ttattttcag tgtaaattat atgagctaaa aaaatgaata
     5401 tagtgtttgt gattactgga ttgggacttg ggggggctga aaagcaagta tgcttattgg
     5461 ctgatcgact ggctgaggca gagcatcaaa tatcgatagt gtcccttaca ggagaatgtg
     5521 ttgttaagcc agataataag catattgaaa tttaccacct taatatggat aaaagcctaa
     5581 taagtttctt tttaggtgtt ataaaattac gtaatattat ttctactatt aagcctgata
     5641 ttgtgcatag ccatatgttc catgcaaata tcatgtctag attaagtaag ttactattgc
     5701 cttattccca taagctcata tgtacagcac atagcagata tgagggaggg aggcttcgta
     5761 tgctttgtta tcgtttaaca gattatttca gtgatataaa tacaaacgtt agtaaagagg
     5821 cgctcgatga atatgttaat aataaatatt tctcagcgac taaatctatt gttgtttata
     5881 atggcgttga tacagaaaaa tttaatttca gtattgataa tcgtatttat ataagagagc
     5941 aattgtctat taataaccat gataggttga ttctcagtgt tggcagatta aaccctgcta
     6001 aagattatcc taatctgcta gcagcattca tgcttttacc tgagcactat aagcttgtta
     6061 tcattggtga gggagacgtt cgttcacaga ttgaacagat aattaaagat cacaagttag
     6121 ggacgcgtgt tcagttactt ggtagtgtaa ataatgtcaa tgattattat tctgcttgtg
     6181 atttatttgt cttatcttcg gcttgggaag gttttagctt agttgttatc gaggccatgg
     6241 catgtcaacg catcgctgta tgtactgatg caggtggagt aaaagaagct tttactgatc
     6301 gtcattacat tgtaccaact tcaaatgccg cagctttagc tcgcaaaatt attgaagttg
     6361 atatgttaac tattgatcgt aaaaatgaaa tacaaaatga taataggaat aatgttgtta
     6421 acaagttttc aattaatgct attgttaatc attggttaac aatctataaa aatataacta
     6481 tgcagtagat aaatctatta cataatgatg aattaaaaac aatattaatt tttagctgtt
     6541 ttttataata cctagttcaa tatggattaa attattttgt tttttaattt gtaatctgga
     6601 ataggttatt atatttacta gattgaatgt agttcaatgg atagattatc acgttaacat
     6661 ttaattgaag tataagttgt gtggcgttgt gttgctttaa attcttaata aataattaag
     6721 gttggtgatt aatgactatt ttagtaacgg gcggtgctgg ctatataggt tcgcatacag
     6781 tggtgagact gcttgaaaag ggtaaagaaa ttgttattct tgatgatttt actaattcat
     6841 ttcctgaaac attgaataga ataaaaataa ttaccggtgt taaacctttt ttttacgaag
     6901 gttctgtcct tgataggaat ttattgaaaa aaatctttgt cgaaaataac atcaccgatg
     6961 ttattcattt tgcagggctc aaatcagttg gtgaatccgt atcgtcccct cttaagtatt
     7021 atgaagtcaa tatagcagga agtttgcatc tagttgaaga gatgattacg cacaatataa
     7081 gcaattttat ttttagctct tctgcaacag tatatggtga accagaaact attccattga
     7141 cggagtcctc tcgcattggt ggcactacaa acccttatgg tacatctaag cttatggttg
     7201 agaaaatact tgaggatgtt actcgttcta atcctgagtt tagaaccaca attttacgat
     7261 attttaatcc cgttggtgct catccttctg gtgatatggg cgaagatccg aatggtattc
     7321 caaataatct catgccttat atctgtcagg ttgctattgg taagtataag caagtttcag
     7381 tatacggaag cgattatcca acaaaagatg gtacaggtgt tcgtgatttc attcatgtca
     7441 tggatcttgc tgaagggcat gttgcagcgt tagaacacag aaataaggga ccaaatcata
     7501 aagtttacaa cttgggcaca ggcactggtt attctgtttt ggaactcctg acagcttttg
     7561 aaagagtaac ttctcgtaaa gtaccttacg ttttaagtga aagacgccct ggagatatcg
     7621 ccgaatgttg gtctaatcct tcgaaggcgt atgcggaact tggatggaaa gcgaagcgcg
     7681 gactggaaga catggttcga gatgcctgga attggcaaca aaagaatcca aacggttata
     7741 aaaaagaatg aatgatcaaa gaaatttttg tcaactgcaa aaattaccga atttatgtat
     7801 cctgagttaa catagcacta acattgaatt gcgttatgtt tcccagtatc acctctgaca
     7861 ggagtaaaca atgtcaaagc aacagatcgg cgtcgtcggt atggcagtga tgg
//