LOCUS       KJ755564                7842 bp    DNA     linear   BCT 29-MAR-2016
DEFINITION  Escherichia coli strain E 40 serotype O33:H- O-antigen gene
            cluster, complete sequence.
ACCESSION   KJ755564
VERSION     KJ755564.1
KEYWORDS    .
SOURCE      Escherichia coli
  ORGANISM  Escherichia coli
            Bacteria; Proteobacteria; Gammaproteobacteria; Enterobacterales;
            Enterobacteriaceae; Escherichia.
REFERENCE   1  (bases 1 to 7842)
  AUTHORS   DebRoy,C., Fratamico,P.M., Yan,X., Baranzoni,G., Liu,Y.,
            Needleman,D.S., Tebbs,R., O'Connell,C.D., Allred,A., Swimley,M.,
            Mwangi,M., Kapur,V., Raygoza Garay,J.A., Roberts,E.L. and Katani,R.
  TITLE     Comparison of O-Antigen Gene Clusters of All O-Serogroups of
            Escherichia coli and Proposal for Adopting a New Nomenclature for
            O-Typing
  JOURNAL   PLoS ONE 11 (1), E0147434 (2016)
   PUBMED   26824864
  REMARK    Publication Status: Online-Only
REFERENCE   2  (bases 1 to 7842)
  AUTHORS   Yan,X., Fratamico,P.M., Tebbs,R.S., O'Connell,C.D., Baranzoni,G.M.,
            Liu,Y. and DebRoy,C.
  TITLE     Direct Submission
  JOURNAL   Submitted (26-APR-2014) Molecular Characterization of Foodborne
            Pathogens Research Unit, USDA-ARS, 600 East Mermaid Lane, Wyndmoor,
            PA 19038, USA
COMMENT     ##Assembly-Data-START##
            Assembly Method       :: CLC Genomics Workbench v. 7.0
            Coverage              :: >50X
            Sequencing Technology :: IonTorrent
            ##Assembly-Data-END##
FEATURES             Location/Qualifiers
     source          1..7842
                     /organism="Escherichia coli"
                     /mol_type="genomic DNA"
                     /strain="E 40"
                     /serotype="O33:H-"
                     /db_xref="taxon:562"
     misc_feature    1..7842
                     /note="O-antigen gene cluster"
     gene            56..1474
                     /gene="yutF"
     CDS             56..1474
                     /gene="yutF"
                     /codon_start=1
                     /transl_table=11
                     /product="hydrolase"
                     /protein_id="AIG62595.1"
                     /translation="MIIAKGDKMIGIILAAGVGSRLRPMTNSKPKCLVTVAGKPILDY
                     QLNSYRLAGIKDIFIIVGYEGDKIKKHCKYIKDLNITIIENNEYEDTNNMYSLFLAKK
                     YAYGESFILNNADLAIDSNIIEKICESPFSDLIAVDVGVFNEESMKVTVNDDNKVSNI
                     SKLIDEKESVGCSIDFYKFSKDSSKIFFDEIERIVLRENNKKDWTEIAMQRLFIDRKM
                     KFDVLDISGCSWVEIDNYADLALADKIFSQKNKKISDYKCYCFDLDGTVYIGREPIKE
                     VIDEINSLKKSGKLIRFISNNSSKCKSEYVNKLKNYGIDVSTEDIKISSDSVIDFLNK
                     EQAKKIYVVGTKSLQKNIIDAGFEICSHEPDFIVLGYDTELTYSKLVTACRLINCGVD
                     YIATHCDVFCPSENGPIPDIGTVVTMLEMTTGRKPYRVFGKPNPDLLNLILNEDRLEK
                     DDLLMIGDRIYTDIQMAENTGI"
     gene            1625..2869
                     /gene="wzx"
     CDS             1625..2869
                     /gene="wzx"
                     /codon_start=1
                     /transl_table=11
                     /product="O-antigen flippase"
                     /protein_id="AIG62596.1"
                     /translation="MLNFTLIKNIFYLFIVQIINYVAPFLVLPYLSRVLSVDNFGLLM
                     MIISASSIALIVTDYGFSLSGPLFVAVNKHNKVVINQYIGTIYLIKSVLISIIWFLFL
                     FIYFISDNEITSHFSNILWLGVIITTQSFQPIWFFQGLEKMKNITFSLIISKSVYVIL
                     IFCLVKTNHVERVFLALVLSNVVTLVISNYLLYRNGYAIGTPCNKLFRDEIKNSFPFF
                     LSRAAVGVYTSASTFIVGSFAGLNQAAVYSSAEKLYQAGQNALSPISQALYPYLARSG
                     DKKTLYKFVVLFFILLCMICILSSYYSNTIVMLFFGNKYNAASQVLNVFLLSLVITFV
                     SFNFGYPAFAAIKRVQIVNYTVVLGGGLQLLMIIILIVSEKITPLNMARSVLFTETLV
                     LISRLGLYFYLILKNDNVSGLK"
     gene            2866..4005
                     /gene="tagF"
     CDS             2866..4005
                     /gene="tagF"
                     /codon_start=1
                     /transl_table=11
                     /product="CDP-glycerol:poly(glycerophosphate)
                     glycerophosphotransferase"
                     /protein_id="AIG62597.1"
                     /translation="MKKIKFIVKFILGRIVYFLSGFFPRNPNIWVFGSFNGFEDNPKY
                     LYLHVLNKNKEIRSIWIAKDKMELLQARNSGGEAYLFNSIKGFLFTLNAGIYIYSAYI
                     TDISFIASRKAKKVNLWHGIPIKKIEFDITSKPLVNHFSKANFINKMIYAYKHIVPDL
                     LLCPSKYSAEYSFKSAFRISERQCLFARYPRVSALLEMKMNKENKKIILYMPTWRDKK
                     PRFLDTAKIEFEKLNTIMRTYDALFVIKLHPLTIIDRAFEYSNISICNKDSDLNELLF
                     VSDCLVTDYSSVVFDYLHLNKPIIFYNFDMDDYLKGREFYFDYNSIICGEIASDYSSL
                     ESKILAVIKGVDLFERKRKEIYDKFTSETGGNQLIVSSIKKIFEK"
     gene            4017..5219
                     /gene="wzy"
     CDS             4017..5219
                     /gene="wzy"
                     /codon_start=1
                     /transl_table=11
                     /product="O-antigen polymerase"
                     /protein_id="AIG62598.1"
                     /translation="MITNYSWHMKNNNIVGMFVLIVSSTIMIYDPWFFGVGRGVVIIG
                     SIFTLISFASGRFFIRKEEVKLLLLLMLFTGYSLLPAIYNYTLDTSVFFMYFKMCIYY
                     ILCFGFVRCFLKKEKLLFVLKGSVVIQCVFILLCLISSDVRNFIFSVHTVEDRFLISE
                     QAYRLYFLSSSSFFQLSLFFGFLFHFMVALYKENKVGLLILILIFVCGAFSGRVFFLF
                     AVISVIFYGINIRYVPLYSVIAFVLIFFAIKFSDNIFIRHALEPLINYINTGEFMTAS
                     SDKLVEKMLFIPDFSALVVGDGVYYNDDGSYYMHTDSGIIRQFLYGGGGYFISCMALT
                     WYLLTKVSKNWLEKNKKFTLSTMIIFLIGNIKADVLMYPGIMINLIFILIFASGDNND
                     KSVYRILK"
     CDS             5188..5931
                     /codon_start=1
                     /transl_table=11
                     /product="glycosyltransferase family 25 (LPS biosynthesis
                     protein)"
                     /protein_id="AIG62599.1"
                     /translation="MIKVFIVSLNDSPRRTSISQLCEKYGVEFEFVDAVNGRKLDDKY
                     INKINSYEWTKLKYKRKLGPAEIGCTLSHLSIYKKNINSWFIVLEDDVDFDERFVKLI
                     NTDITGLRTDALYLLGGQEGLESFKNVIFSRFGGIKVNTDIVFLKTICSERFLFRTCC
                     YMIHHSLAKRIVDFNQNKLTLADDWFNLSKFNIINNIYYIDLVAHPSDLSTSLIHSDR
                     LKLIKTRQKIYRTFMKRIVLKFRSILRFF"
     gene            5934..6827
                     /gene="wfgD"
     CDS             5934..6827
                     /gene="wfgD"
                     /codon_start=1
                     /transl_table=11
                     /product="UDP-Glc:alpha-D-GlcNAc-diphosphoundecaprenol
                     beta-1,3-glucosyltransferase WfgD"
                     /protein_id="AIG62600.1"
                     /translation="MFSIIIPSFNREKQLRNALAELNKQLYKEFEVIVVDDASDSEYD
                     LSLDLYHFELKYFRSNKNLGPSGARNIGVSLARYDWIIFLDDDDYFMPEKLMILRETI
                     LNNDIDFIYHRARIFMINEKISYISSQKDIDKIPGEVYQHMLSGNFIGGPPNFAIKKS
                     LFNKLHGFSDNVRAIEDYEFLLRLVRDVSKKRIIFVDNVLTGCNYITKSTSVSKNINN
                     LKQACTYISEEYIRDNREFNLFKINTSLMTAHALLMSLNRKSAFFYFKAGIVMFNIKY
                     IVSGIISFISPVLMIKLRGKV"
     gene            6836..7681
                     /gene="wbbD"
     CDS             6836..7681
                     /gene="wbbD"
                     /codon_start=1
                     /transl_table=11
                     /product="UDP-Gal:alpha-D-GlcNAc-diphosphoundecaprenol
                     beta-1,3-galactosyltransferase"
                     /protein_id="AIG62601.1"
                     /translation="MSDGKENRYQKFSVLMSVYYKEDPVFFYDAVNSVFENTIPPDDV
                     VIVVDGPIPEGITDVILALREKYELNIVYLSENVGLGRALNIGINYCKYNIIMRMDTD
                     DYCINTRFQEQLDFFSTHSDVVLLGGDIAEFDTELSNFIGVRHVAYTCQSIRDMAKKR
                     NPFNHMTVAFKKDIILSVGGYKHHPFMEDYNLWLRVIAKGHKVANLPKVMVNVRAGES
                     MLSRRKGIRYIYSEYVLYKLKLSLGIDSLVHGFFIFLMRAFIRVIPIGMISILYSSIR
                     KNMNT"
ORIGIN      
        1 ctctggtagc tgttaagcca agggcggtag ctatttgaaa tcatactaaa acaatatgat
       61 tattgcaaaa ggtgataaaa tgatcggtat tattctggct gcaggtgttg ggtctcgttt
      121 gcggcccatg actaatagta aacctaagtg tcttgtgact gttgctggca agcctatctt
      181 ggattatcag ttaaactcct atcgcctggc tggtattaaa gatatattca ttatcgttgg
      241 ctatgaaggt gataaaataa aaaaacactg taaatatatt aaagatctca atattacaat
      301 aatagaaaat aatgagtatg aagatacaaa taatatgtat tcactctttt tagcaaagaa
      361 atatgcatat ggagaatctt ttatattaaa caatgcagat ctagctattg attcaaatat
      421 catagagaag atttgtgaga gtcccttcag tgatcttatt gctgttgatg taggtgtttt
      481 taatgaagag tctatgaagg ttactgtaaa tgatgataat aaagtaagta atatatctaa
      541 attaatcgat gagaaagagt ctgttggttg ttctatagat ttctataaat tctctaagga
      601 ttcaagtaag atctttttcg atgaaataga acgaattgtg ttaagggaga ataataagaa
      661 agattggact gagattgcaa tgcaacgtct atttatagat agaaagatga aatttgatgt
      721 tttagatatt agtggatgtt cttgggttga aattgataac tatgctgatt tagctcttgc
      781 tgataaaata ttctcccaaa aaaataaaaa aatatcggat tataaatgtt attgctttga
      841 tttagatggt acagtctata tcggtagaga gccgattaaa gaagttatag atgaaataaa
      901 tagtttaaag aaaagtggta agttgattcg ttttatatca aataactcat caaagtgcaa
      961 aagtgaatat gtaaataagc tgaaaaatta cggtatagat gtatctactg aagatattaa
     1021 aatatcatct gatagtgtta ttgacttttt aaataaagaa caggctaaga aaatatacgt
     1081 agttggaaca aaaagtttac aaaaaaatat aattgatgct ggttttgaga tttgttctca
     1141 tgaacctgat tttatcgtgt taggatatga tactgagctg acttattcaa aattggtaac
     1201 cgcttgtagg ttaattaatt gcggggttga ttatattgca acgcattgtg atgttttttg
     1261 tccctcggaa aacggtccta ttcctgatat tggcactgta gtaacaatgc tagaaatgac
     1321 aactgggcgt aagccgtacc gcgtatttgg taaaccaaat ccagatcttt tgaatttgat
     1381 tcttaatgaa gatcggctgg aaaaagatga tttgttaatg attggtgacc gtatatatac
     1441 agatattcaa atggctgaaa atacaggcat ttgaatatct gtatattgta ttaacaggtg
     1501 acactaaacg agaggatatt gaggattcat ctgtaaaacc gacctatata cttcaacact
     1561 tctctcagta attttcaaca aataataacc tgtgttttaa gcaggttatt atttggagcc
     1621 atctatgtta aattttaccc ttattaagaa tatattttat ttatttattg tacaaataat
     1681 aaattatgtt gctccttttt tagttttgcc ttatttaagc agggttctgt ccgttgataa
     1741 ctttggcttg ctaatgatga taatatctgc aagttctatc gcattaattg ttactgatta
     1801 tggattcagc ttatcaggac ctctgtttgt ggctgtaaat aagcataata aagtagtcat
     1861 aaatcaatat attggaacaa tatatttaat aaaatctgtg ttaattagca ttatatggtt
     1921 tttatttctc tttatatatt ttataagtga taatgaaatt acatcacatt tttcaaatat
     1981 attatggcta ggtgttataa taacaacgca gtcctttcag cctatttggt tttttcaagg
     2041 gctagagaaa atgaaaaata taacgttttc tctaattata tccaaatcgg tttatgtaat
     2101 attgattttc tgccttgtaa agacaaatca cgtggagcgt gttttcttgg cattagttct
     2161 aagtaatgtt gtaactttag taataagtaa ttatctttta tatcgtaatg gttatgcgat
     2221 aggcacacct tgtaataaat tatttagaga cgaaataaaa aatagttttc catttttttt
     2281 atcaagggct gctgtaggtg tttatactag tgcaagtacc tttattgttg gcagttttgc
     2341 agggttaaat caggcggctg tttattcgag cgctgaaaaa ttatatcaag cagggcaaaa
     2401 tgctttatcg ccaatatctc aagcattata tccatattta gcaaggtctg gcgataagaa
     2461 aactttatat aaatttgttg ttctgttttt tattttgctc tgtatgatat gcatattgag
     2521 ttcgtattat tctaatacca tagtgatgtt attttttggt aataaatata atgcggcatc
     2581 tcaggtttta aatgtttttc tcttaagtct tgttattact tttgttagtt ttaactttgg
     2641 atatcctgct tttgctgcaa ttaaacgagt gcagattgtc aattatacag ttgttctggg
     2701 gggcggattg cagttattga tgattatcat tttaatagtt agtgaaaaga tcacccctct
     2761 gaatatggct cgaagtgtgt tatttactga aacactagtg ttgatttcta gattaggttt
     2821 atatttctat cttattctca agaatgataa tgtatcaggt ttgaaatgaa aaaaataaaa
     2881 tttatcgtta aatttatttt gggacgaatt gtttactttt tgtctggttt ttttcctaga
     2941 aatccaaata tatgggtttt tggtagcttt aatgggtttg aggataatcc aaaatattta
     3001 tatctacatg ttcttaataa gaacaaagaa ataaggtcta tatggatagc aaaagataaa
     3061 atggaattgc tgcaggctag gaatagtgga ggtgaggctt atctttttaa tagcataaaa
     3121 ggatttttgt ttaccttaaa tgcaggtata tatatatatt cagcatatat tacggatata
     3181 tcttttattg catccagaaa agctaaaaaa gttaatttat ggcatgggat tcctatcaag
     3241 aaaatagagt ttgacataac atctaaaccg ttggtcaacc atttctctaa agccaatttt
     3301 ataaataaga tgatatatgc atataagcat atagttcctg atcttcttct ttgtccaagt
     3361 aaatactcgg ctgagtattc atttaaatcg gcattccgta tttctgagag acagtgttta
     3421 tttgcaagat atccaagagt atccgcatta cttgaaatga agatgaataa agagaataaa
     3481 aaaataattc tttatatgcc aacttggcgt gataaaaaac ctaggtttct agatacggca
     3541 aaaattgaat ttgaaaagct gaatactatc atgagaacat atgatgcatt atttgttatt
     3601 aaactacatc cacttactat catagatagg gcttttgaat attctaacat tagtatttgc
     3661 aataaagact ctgatttaaa tgaactttta tttgtttccg attgtctggt aacagattat
     3721 tcgtctgttg tttttgatta tctacatttg aataaaccaa taatatttta taattttgat
     3781 atggatgact atctgaaagg gcgagagttt tatttcgatt ataactcaat aatttgtggc
     3841 gaaattgctt ctgattattc tagtttagag tcaaaaattt tagctgtgat taaaggagta
     3901 gatcttttcg aaaggaaaag gaaggaaata tacgataaat ttacttctga gactggagga
     3961 aatcaactta tagttagttc tattaagaaa atatttgaaa aataaagaga gttcttatga
     4021 taactaatta ttcgtggcat atgaaaaata ataatattgt aggaatgttt gttttaattg
     4081 tttcttctac aatcatgata tatgacccat ggttttttgg tgtggggcgt ggtgttgtta
     4141 ttattggtag catttttact ttaatttctt ttgcatcagg acgatttttt ataaggaaag
     4201 aggaagttaa acttctgctg cttttaatgc tttttacagg atattcctta ttgcctgcaa
     4261 tatataatta taccttagat actagtgtat tttttatgta ttttaaaatg tgcatatatt
     4321 atatactttg ttttgggttt gttaggtgtt ttttgaaaaa agaaaaacta ttattcgttt
     4381 tgaaaggaag tgttgttata caatgtgtat ttatcttact gtgtttaatt tcctctgatg
     4441 tccgaaactt tatattttca gttcacactg ttgaagatag atttcttatt tcagagcaag
     4501 catatagact ttatttcctt agctcatcat ctttttttca attaagtctc ttttttggat
     4561 tcttattcca tttcatggtc gctttatata aagaaaataa agttggttta cttattttga
     4621 tattgatttt tgtatgtggg gctttttctg gtcgagtatt tttcctattt gcagttattt
     4681 ctgttatttt ttatggtatt aacataagat atgtgccgtt atatagtgtc atagcttttg
     4741 ttttgatatt ttttgctatc aagttttctg ataatatttt tatcagacat gcactggaac
     4801 cattgattaa ttacattaat acaggagagt tcatgactgc atcaagtgat aagttggtgg
     4861 aaaaaatgtt gtttattcct gatttttctg cattggtcgt aggtgatggt gtatattata
     4921 atgatgatgg ttcctattat atgcatactg attcaggtat cataaggcaa tttttgtatg
     4981 gtggaggggg gtattttata tcttgtatgg cattaacttg gtatttactg actaaagtct
     5041 cgaaaaactg gcttgaaaaa aataaaaaat ttactctatc aacaatgatt atttttttga
     5101 tcggaaatat caaagccgat gttctaatgt atccaggcat tatgataaac ttaattttta
     5161 ttttaatttt cgcgtcagga gataataatg ataaaagtgt ttatcgtatc cttaaatgat
     5221 tcaccgagaa gaacatcaat ctcacaattg tgtgaaaaat acggtgtaga atttgagttt
     5281 gttgatgcag ttaatggcag aaaattagat gataaatata tcaataaaat taattcttat
     5341 gagtggacaa agttaaaata taaaagaaag ttaggtccag cagagattgg ttgtacatta
     5401 agtcatctca gtatatacaa aaaaaatatt aatagttggt tcatcgtttt ggaggatgat
     5461 gtcgattttg atgaacgatt tgtaaaatta atcaatactg atataacggg tttaaggact
     5521 gatgctctct atcttcttgg tggacaagag ggacttgagt catttaaaaa tgttattttt
     5581 tcaagatttg gtggtattaa ggtgaataca gatatagtct ttcttaagac tatctgtagc
     5641 gaacgttttt tatttcgaac atgttgctat atgattcatc attcgttagc taagagaata
     5701 gtagacttca accaaaataa acttacttta gctgatgatt ggtttaattt atcaaagttt
     5761 aatatcataa acaatatata ttatattgat cttgttgcac atcctagtga tctgagtact
     5821 tctctaattc attcagatag gctaaaatta attaaaacaa ggcaaaaaat atatagaaca
     5881 ttcatgaaaa gaatagtatt aaaattcagg tctatattga ggtttttctg atcatgtttt
     5941 caattataat acctagcttt aatcgagaga aacaacttcg taatgcattg gccgaactaa
     6001 ataagcaact ctacaaagag tttgaggtta ttgttgttga tgatgcttcg gacagtgaat
     6061 atgatttgag cttggatctt tatcattttg aactgaaata tttcagaagc aataaaaatt
     6121 taggaccatc aggtgctaga aatattggag ttagtttagc tcgttatgat tggatcattt
     6181 ttttggacga tgatgattat ttcatgccgg aaaaactgat gatcttgagg gagactatat
     6241 taaataatga tattgacttt atctatcata gagctcgtat atttatgatc aatgaaaaaa
     6301 ttagttatat ttctagtcaa aaggatatag ataaaattcc aggtgaggta tatcaacata
     6361 tgctttctgg taattttatt gggggccctc caaattttgc aataaaaaaa agcctgttta
     6421 acaaacttca tggcttttct gataatgtac gagcaattga agattacgag tttctactaa
     6481 ggttagtccg tgacgttagt aaaaaaagaa ttatttttgt agataatgtg cttactgggt
     6541 gtaattatat aacaaaatca accagcgtct ctaaaaatat aaacaattta aaacaagcat
     6601 gtacatatat atcagaggaa tatattcgag acaaccgcga atttaatctt tttaaaataa
     6661 atacctcgtt gatgactgca catgctttgt taatgagttt aaatcgtaaa tctgctttct
     6721 tttattttaa agcagggatt gttatgttta atataaaata tattgtatct gggattattt
     6781 cttttatatc tccagttttg atgattaagc ttaggggcaa ggtttaataa taattatgag
     6841 tgatgggaaa gaaaatagat atcaaaagtt tagtgttttg atgtccgtgt actataaaga
     6901 agatcctgtt tttttttatg atgctgttaa tagtgtgttt gagaatacga tccctccaga
     6961 tgatgtggtc attgtcgttg atggacctat cccggaaggt attacggatg taatattggc
     7021 gctaagagaa aaatatgaac ttaacattgt ttatttaagt gaaaacgttg gtttagggag
     7081 agctctaaat ataggtatta attattgcaa atataatatt ataatgagaa tggatactga
     7141 tgattattgt attaatacaa ggtttcagga acaattagat tttttttcca ctcattctga
     7201 tgtggtcctt cttggtgggg atattgcaga attcgatact gaactgtcta attttattgg
     7261 tgttagacac gtggcgtata cttgtcaaag tataagggat atggcaaaaa agagaaatcc
     7321 atttaatcat atgacagtag ccttcaaaaa agatataata ttatcagttg gtggatataa
     7381 gcatcatcca tttatggagg attacaatct ttggttaaga gttattgcta agggacataa
     7441 agtagctaat cttcctaaag ttatggttaa tgtgagagct ggtgaaagta tgctttctcg
     7501 acgaaaagga attcgctata tatatagtga gtatgtatta tataaactca agttgagctt
     7561 gggtattgat agtttagttc atggtttttt tatatttttg atgcgtgcat ttattcgtgt
     7621 tattccaatt ggtatgataa gtatcttata ttcaagtata agaaaaaaca tgaatacata
     7681 agtggatgtt aaagatgata ttacacgttt tgtatattta aatcgtgatt tgctataata
     7741 gaaatgctca cataatctat cagatctatt aaactaagta agcccccctg acaggagtaa
     7801 acaatgtcaa agcaacagat cggcgtcgtc ggtatggcag tg
//