LOCUS       KJ778799                7460 bp    DNA     linear   BCT 29-MAR-2016
DEFINITION  Escherichia coli strain E54071-88 serotype O178:H7 O-antigen gene
            cluster, complete sequence.
ACCESSION   KJ778799
VERSION     KJ778799.1
KEYWORDS    .
SOURCE      Escherichia coli
  ORGANISM  Escherichia coli
            Bacteria; Proteobacteria; Gammaproteobacteria; Enterobacterales;
            Enterobacteriaceae; Escherichia.
REFERENCE   1  (bases 1 to 7460)
  AUTHORS   DebRoy,C., Fratamico,P.M., Yan,X., Baranzoni,G., Liu,Y.,
            Needleman,D.S., Tebbs,R., O'Connell,C.D., Allred,A., Swimley,M.,
            Mwangi,M., Kapur,V., Raygoza Garay,J.A., Roberts,E.L. and Katani,R.
  TITLE     Comparison of O-Antigen Gene Clusters of All O-Serogroups of
            Escherichia coli and Proposal for Adopting a New Nomenclature for
            O-Typing
  JOURNAL   PLoS ONE 11 (1), E0147434 (2016)
   PUBMED   26824864
  REMARK    Publication Status: Online-Only
REFERENCE   2  (bases 1 to 7460)
  AUTHORS   Yan,X., Fratamico,P.M., Tebbs,R.S., O'Connell,C.D., Swimley,M.,
            Baranzoni,G.M., DebRoy,C. and Liu,Y.
  TITLE     Direct Submission
  JOURNAL   Submitted (30-APR-2014) Molecular Characterization of Foodborne
            Pathogens Research Unit, USDA-ARS, 600 East Mermaid Lane, Wyndmoor,
            PA 19038, USA
COMMENT     ##Assembly-Data-START##
            Assembly Method       :: CLC Genomics Workbench v. 7.0
            Coverage              :: >50X
            Sequencing Technology :: IonTorrent
            ##Assembly-Data-END##
FEATURES             Location/Qualifiers
     source          1..7460
                     /organism="Escherichia coli"
                     /mol_type="genomic DNA"
                     /strain="E54071-88"
                     /serotype="O178:H7"
                     /db_xref="taxon:562"
     misc_feature    1..7460
                     /note="O-antigen gene cluster"
     CDS             60..662
                     /codon_start=1
                     /transl_table=11
                     /product="haloacid dehalogenase-like hydrolase"
                     /protein_id="AIG62751.1"
                     /translation="MKKILLIDICGTITEKNTTLDFISYIGNKPSILKIIIGRICWRL
                     FKKDTIRKWYLNELKGLSYTELVNLAREYVNQLNYDPQIICLINKHRGECDLYFVSAT
                     LDFIAKAMTEKFNVSGFFASELQFEDGICSGKLKIDLLDNKNSVIAHMFNGDNNVCII
                     SDNYGDYSVMEKCCSSWAVVRNKRALHYWSKKSVNIINRM"
     CDS             665..1447
                     /codon_start=1
                     /transl_table=11
                     /product="membrane protein"
                     /protein_id="AIG62750.1"
                     /translation="MYIFICPFAYTYNTRLRSLISLLSWGVIYFLLLLLFVMVQNDGI
                     NYTEIMIFCVSIIIVYNNYEIGYIINDTETIKKEKKPTMRLGRNALDIYESNKLLIYG
                     IRVCLSIVLTLFLIKILKVQGYYIISAWLILPVFIIYNSIRNRLNLALHFTLVTLRYC
                     SPILIVLQPEHIINVILNVSFIFPLPNLIERCSEPRFDFSIFHKVKSNHNKWRVFYYF
                     MATTITGLALYTKMLSEYVVFICCLYMLVYRVLSPLLIQKKN"
     gene            1966..3036
                     /gene="pglJ"
     CDS             1966..3036
                     /gene="pglJ"
                     /codon_start=1
                     /transl_table=11
                     /product="N-acetylgalactosamine-N,
                     N'-diacetylbacillosaminyl-diphospho-undecaprenol
                     4-alpha-N-acetylgalactosaminyltransferase"
                     /protein_id="AIG62749.1"
                     /translation="MKKVTFLINSLTQGGAEKIALQLYLNLSNSVNVINFVTLTNEDF
                     YKKEIKKHVINHSNKITKTTKFMSLAKYFRGNNDVVLCFSLDLACYLYILRVLNIFKG
                     EVICRFINNPDNEIGTSLINKIKKRFLFYVLKKCDLVICQSDAMKDLLISKYEFNNNK
                     TVRIYNPISIANNNESSFEQTDKEILKLLFVGRLSEQKNLDDIIRISQLLRDKGILFL
                     WKVVGKGPLENTFKDLIREHNFQESIVMCGSSHDIENYYKWADITTLVSHYEGLPNVL
                     LESISHRTPCVSYDCPTGPSEIIIHEKNGVLVPQYDVDNFYQSLIRVKELNLKNDNIE
                     NTLMKFSEKKVCDEYYKILFKR"
     gene            3042..4271
                     /gene="wzx"
     CDS             3042..4271
                     /gene="wzx"
                     /codon_start=1
                     /transl_table=11
                     /product="O-antigen flippase"
                     /protein_id="AIG62748.1"
                     /translation="MKKVLGNFLNLTLLQLSTMVAQILTYPYLNRTLGSEIYGEIILQ
                     QIIVFHLQMIVFFGTDLSVVRLAAKYSTSYKKLKILVSKIILGRFYISLIISLSYVIY
                     IYCSGLNDWFYFFVFSIFEAVITLRWFYHGIQQLNKFTLPFCLVRFLGVLLLFIFVNT
                     QDDWYKVIIITVLSSLLSVIVSFYSYYVKYNLTYIKFRQIYIIFSDSFSLFATNIVSV
                     IKDRSGGIFISYFMGTSSLVYYDFCMKIVGVISSITSSISASLYPMFSGNYNHSTFKR
                     YNLLILLFSIIPFIITLLADKYVIFLIDYIFSINIFDIKLLLPVFGFMIFVRSHGYFI
                     GLCYLMALNYKKSYASSLIYSGMFYLFFMSGMIFLGYKNILTLAVGLSLSLLLEYFHR
                     IYKFISTRKNANESHAE"
     gene            4252..5349
                     /gene="wzy"
     CDS             4252..5349
                     /gene="wzy"
                     /codon_start=1
                     /transl_table=11
                     /product="O-antigen polymerase"
                     /protein_id="AIG62747.1"
                     /translation="MNLTLNNFMAYILAITMVIQSYSKVFFASDTIFQNIFFYALVFF
                     SILIVSISKFRLIDIFLIITSLIIYLAFGNGFALKLFLVSLAVRCLDINKLLRCYLLL
                     ATIAFITVILFNVGGDNSLIYFKGDGVFRIARETLGFDNPNKPFYYLLPIFCCFVFLY
                     FKRYPISCITAIIIVTYAVYIKTLTTTGLFSNALLVLMLLIYYFLPNKTNRILGNPLL
                     ITCSIIVLYFASFFIAFTYHSDSNVNFFLSHRPEYWYEIIRTTNIYLLIFGQALDLQL
                     VPLDNSYIHSVIYMGAFFCIIMIFCYWLGLTRAKLRGVNISIISILCIYVFFYSFGET
                     LLVEPTLNITFIIIFNYIRDHNKYESIDLQP"
     gene            5324..6475
                     /gene="pimB"
     CDS             5324..6475
                     /gene="pimB"
                     /codon_start=1
                     /transl_table=11
                     /product="GDP-mannose-dependent
                     alpha-(1-6)-phosphatidylinositol monomannoside
                     mannosyltransferase"
                     /protein_id="AIG62746.1"
                     /translation="MKVLIFSHEYPPIGGGAGVVANQLIDHFCEDSAIEQIDLLTRFS
                     HQYNINKRVHNIFMVNIHNIIWPLEYFFYLKKTINLSLYDLIICNDSISIYIAGMLFS
                     SRELEKTVCFLHGSEPEFIYQNNNIQKRLLNLRFFFHRALIKCSFIGAHSEFMKEKFL
                     KNLPKKLTINSDKIIPLYFGYDASLFNTDNKIENKNTIRNKYKIKESDILLLTVSRVE
                     EKKGFVKMLNTFEIMKSMNENLKWMIVGDGGFLESLKEIAFRKGIFDSLIIVGKIPRS
                     ELCMFYNSADIFWLLSEYQEAFGLVYIESQACGVPAIGYNNSGVREAVVDGVTGYLIN
                     NLDEMLDIVVTKKFQIITNESLYDFSKKFDSKQCYLKILKAFNEKVINE"
     gene            6468..7307
                     /gene="amsE"
     CDS             6468..7307
                     /gene="amsE"
                     /codon_start=1
                     /transl_table=11
                     /product="amylovoran biosynthesis protein"
                     /protein_id="AIG62745.1"
                     /translation="MSDVIMHDKKTEEFSVLMSLYCNEKPEFLNQCLISIHEQSVKPN
                     EIIIVYDGYIPAELDNVVIEWSKVLPVKVCKLHTNMGLGDALNFGIKHCNNELIARMD
                     TDDICAKDRFKLQLQYFNENPTLTLIGGGIEEYDEEMQVLRGTRFTKEKHADIVKYAC
                     FKNPFNHMTVMFKKKDIQSVGGYKKHLLMEDYNLWLRLLNNGYKTYNLPEILVYARTG
                     INMVRKRRGMSYFKSEIQLFKLKRMLNCNSYSKNCVIFLIRVLPRFLPVSVLSIVYKI
                     MRK"
ORIGIN      
        1 tctggtagct gtaaagccag gggcggtagc gtggaaatta cgagagttat agggtttaaa
       61 tgaaaaaaat tttattaatt gacatatgtg gaacaataac tgagaaaaat accacattag
      121 attttatttc atacattgga aataagccaa gtattctaaa aattattatc ggacgtattt
      181 gctggcgctt atttaaaaaa gatactataa gaaaatggta tctcaatgaa ttgaaaggat
      241 taagttatac ggagttagtt aatttagcaa gggaatatgt gaaccagttg aattatgacc
      301 cccagattat atgtttaata aataaacata gaggcgagtg tgacttatat tttgtgtctg
      361 ctacgttaga ttttatagca aaagcaatga cggaaaaatt taatgtatct ggtttttttg
      421 catcggaatt acaatttgaa gatggtattt gttctggaaa attaaaaata gatttgcttg
      481 ataataaaaa tagcgtgatt gcacatatgt ttaatggtga taataatgtt tgtataatta
      541 gcgataatta tggcgattat agcgttatgg aaaaatgttg tagttcatgg gctgtagtga
      601 gaaataaaag agcattacat tattggagta aaaaatcggt taacatcatc aacaggatgt
      661 aaatatgtat attttcatat gtccgtttgc ttatacttat aatactaggt taagatcatt
      721 aatttcacta ctgtcatggg gggtaattta ttttcttttg ttattattgt ttgttatggt
      781 acaaaatgat ggaattaatt acacagaaat tatgattttt tgtgtttcta ttataattgt
      841 ttataataat tatgaaattg gctatattat taatgatact gaaacgatta agaaagaaaa
      901 aaaaccaaca atgcgtcttg gaagaaatgc actagatata tatgaatcta ataaattatt
      961 aatatatgga attagagtgt gtttatcaat tgtattaaca ttgtttttaa ttaagattct
     1021 taaagtgcaa ggttattata taatatcagc atggctaatc cttcctgttt ttattatata
     1081 taattctata aggaacagat taaatttagc tcttcatttt acattggtta ctttaagata
     1141 ttgttcacct atattaatag tattacagcc agaacatata ataaacgtta ttctcaatgt
     1201 ttcttttatt ttccctttac caaatctaat tgaaagatgt tcagaaccaa gatttgattt
     1261 ttcgattttt cataaagtaa aatcaaatca taacaaatgg agagtctttt attattttat
     1321 ggcgaccact attactggat tagcccttta tacgaaaatg ttatctgaat acgtggtttt
     1381 catatgttgt ttgtatatgc tggtatatag agttttatca cccttattga ttcagaaaaa
     1441 aaattgacta taagattata taaataataa agtaccttat gctaaggaga ttgaactgaa
     1501 gaaggtgatt taaaaccttc tatactatac tttattaaat actaagataa cgcgtaaatg
     1561 gttaaaatga cagtgattaa aatgataaat atgttatctt tcaattaatt cctaatttga
     1621 attctattca gtacctaaat attatcttca tttaaattca atggatggtt ttggaaattt
     1681 aatatgactt attctgattt ctagaaaaat aataagaatt gttattatat cttgggggga
     1741 ttaaagaatt catattagta accaatgaga ataaaataaa agagggaatt aatttacgtt
     1801 attttgataa acatttttca caattaaaat ttttaaatgc gttaagtatt tatcaataca
     1861 aaatataaaa gggttatccc gaaagctact gctcacgtta gtgttgaggg aatatatttt
     1921 ttactcttac aaagagtgat gaacttcggt taaaagtgaa tataaatgaa aaaagttact
     1981 tttcttatta atagcttaac acaaggtgga gcagaaaaaa ttgcattgca actttaccta
     2041 aatttgagta attcagttaa tgttattaat tttgttacat tgacaaatga agatttttat
     2101 aagaaggaaa taaaaaaaca tgtgataaat catagcaata aaataactaa aacaacaaaa
     2161 tttatgtcat tagcaaagta ttttagaggt aataatgacg ttgtcttatg tttttcatta
     2221 gatcttgctt gttatttata tatccttcgc gttctaaata tttttaaagg agaggtgatt
     2281 tgtagattca taaataatcc tgacaatgaa ataggaactt cattaattaa taaaataaaa
     2341 aaaagatttc ttttttatgt gttaaaaaaa tgcgatcttg ttatttgcca atcagatgct
     2401 atgaaggatc ttttgataag caaatatgaa ttcaataata ataaaacagt aaggatttat
     2461 aatccaatat ctattgcaaa taataatgaa tctagtttcg agcaaactga taaagaaatt
     2521 ttaaaattac tttttgttgg aaggcttagc gaacaaaaaa atttagatga tataattcgt
     2581 atttctcaac tattgagaga taaaggaata ttatttttgt ggaaggttgt cggaaaaggg
     2641 ccgctggaaa acacattcaa ggatttgata agagaacaca atttccaaga gagtatcgta
     2701 atgtgtggtt cgtcacatga tatagaaaac tactataagt gggcagatat cacaacttta
     2761 gtttctcatt atgaagggct tccgaatgta ttgttagaat ccatctcgca tagaacgcca
     2821 tgcgtaagtt atgactgtcc cactggtcct tctgaaataa ttattcatga aaaaaatggt
     2881 gttttagtac ctcaatacga tgtggacaat ttttatcaat ctcttattcg agttaaagag
     2941 ctaaatttaa aaaatgataa tattgagaat acattaatga aattttcgga gaaaaaagtt
     3001 tgtgatgaat actataaaat tctttttaag agataatcag attgaagaaa gtattaggaa
     3061 atttcttaaa tttgacactt ttacagttgt caacaatggt tgcacaaatc cttacttacc
     3121 cttacttaaa taggactctg ggcagtgaga tatatgggga aataattctt cagcagataa
     3181 tagtttttca tcttcaaatg atagtttttt ttggtacaga cctgtccgta gtgaggttgg
     3241 cagcaaaata tagtacaagc tataaaaaac tgaaaatttt ggtgagtaaa ataatccttg
     3301 gtcgatttta tatttcttta atcatttctt tatcttacgt tatttatatt tattgtagcg
     3361 gtttaaatga ttggttttat ttttttgtat tttctatttt tgaagcagtt attactttga
     3421 gatggtttta tcatggtatt cagcaactta ataaatttac cttacctttt tgtttagtca
     3481 gatttttagg agtgttgctt ctatttatct ttgttaatac tcaggatgat tggtataagg
     3541 ttataataat aactgtattg tcgtcattac tcagtgttat agtatctttc tattcctact
     3601 atgttaaata taatcttact tatattaagt ttcgtcaaat atatattata ttttctgatt
     3661 cattttcgct ctttgctaca aacattgttt cggtgattaa ggaccgatct ggaggtattt
     3721 ttatcagcta ttttatggga acatcttcac tagtttatta cgatttttgc atgaaaattg
     3781 ttggtgtaat ttcaagtata acttctagca tatcagcttc tttgtacccg atgttctcag
     3841 gaaattataa ccattcaaca ttcaaaagat ataatttact aatattgttg ttttcgataa
     3901 ttccatttat tattactctg ctcgcagata aatatgtgat ttttttgata gattatatat
     3961 ttagtattaa tatatttgat attaaattat tgctgccagt atttggattt atgatttttg
     4021 taagatctca tggttacttc ataggcctct gttatctaat ggcattaaac tataaaaaga
     4081 gttatgcttc atcattgatt tattcgggta tgttttattt gttttttatg agcggtatga
     4141 tttttttagg ctataaaaat attttaacac ttgctgtagg gttatcatta tcattgttac
     4201 tagagtattt tcatcggata tataaattta ttagtacaag gaaaaatgcc aatgaatctc
     4261 acgctgaata attttatggc atatatactc gccattacaa tggtaattca atcttatagt
     4321 aaggttttct ttgcatctga tactatcttt cagaatatct ttttttatgc tttagtgttt
     4381 ttttcaatat taattgtaag tatatcaaag ttcagattga ttgatatatt tttgattatt
     4441 acatcattaa taatttattt agcattcggt aacggctttg cattaaagct tttcttagtt
     4501 tcattagcag tacgttgttt agatataaac aaattgctcc gttgttatct cttactagcg
     4561 actattgcat ttataacagt tatattattc aatgtgggag gtgataactc tttaatttat
     4621 tttaaaggag atggtgtatt tagaattgca agggaaacat taggttttga taatcctaac
     4681 aaaccatttt attatctttt gccgatattt tgttgttttg tttttctata tttcaaacgt
     4741 tatccaatat catgcataac agctattata attgttacat atgcagtata tataaaaaca
     4801 ctaacaacta caggactttt cagtaatgct ttgcttgttt taatgttgtt gatatattat
     4861 tttcttccca ataaaacaaa tcgaatatta ggtaatcctc ttttaattac ttgctcaatt
     4921 attgtattat attttgctag tttttttatt gccttcactt atcattctga tagtaatgtg
     4981 aatttctttc tttctcacag gccagaatat tggtatgaga taattagaac aacaaatatt
     5041 tatttgctaa tatttggaca agcattagat cttcaattgg ttccgttaga taattcctat
     5101 atacattctg tcatttatat gggggcgttt ttctgtatta taatgatttt ttgctattgg
     5161 ctggggttaa ccagagctaa actcagaggg gtaaacattt caattataag tatattatgc
     5221 atatatgtgt ttttttattc ttttggtgag actttgttag ttgagccaac actcaacatc
     5281 acatttatta ttatatttaa ctatatacgg gatcataata aatatgaaag tattgatctt
     5341 cagccatgag tatccaccta taggaggagg ggctggtgta gtagctaatc aactaatcga
     5401 ccatttttgt gaagacagcg cgatcgaaca aattgatctg cttactcgtt tttcgcatca
     5461 gtacaatatt aataagagag tgcataatat ttttatggtt aacatccata atattatatg
     5521 gcctctagag tacttttttt atcttaagaa aactataaat ctttctcttt atgatcttat
     5581 tatctgtaat gactccatat ctatatatat tgcaggtatg ctgttctcta gtagagaact
     5641 agaaaaaacc gtatgttttt tacatggttc tgaacctgag tttatatatc aaaataataa
     5701 tatccaaaaa aggttactta atttaaggtt tttttttcac agagctctaa ttaagtgtag
     5761 ctttatcgga gctcatagtg aatttatgaa agaaaaattc ctgaagaact tgcctaaaaa
     5821 attaactatt aattcagata aaataatccc cttatacttt ggctatgatg catcattgtt
     5881 caatactgat aataaaatag aaaataaaaa tacaattcgt aataaatata aaataaaaga
     5941 aagtgatata ttattactta ctgtttcaag ggtggaagaa aaaaagggat ttgtaaaaat
     6001 gttaaatacc tttgaaataa tgaaatccat gaatgagaat ctaaaatgga tgatagttgg
     6061 ggatggggga ttccttgaaa gtttgaaaga aattgcattt aggaaaggga tttttgattc
     6121 actcattatc gtaggtaaaa ttccaaggag tgagttgtgc atgttttata attctgcaga
     6181 tattttttgg ttattatcag aataccaaga ggcttttggt ttggtatata ttgaatctca
     6241 ggcatgtggg gtacctgcca ttggctataa taattcagga gtaagagagg ctgttgttga
     6301 tggagttacg gggtatttaa ttaataatct tgatgaaatg ttggatatcg tagtgacaaa
     6361 gaaattccaa attataacta atgaatccct atatgatttt tcaaaaaaat tcgactcaaa
     6421 acaatgttac ttaaaaatac taaaagcgtt taacgagaaa gttatcaatg agtgatgtta
     6481 taatgcacga taaaaaaact gaggaattct cagtattaat gtctttatat tgtaacgaaa
     6541 aacctgaatt tttaaatcaa tgcttaataa gtatccatga gcaatctgtc aagcctaatg
     6601 agatcatcat tgtttatgat ggctatatac cggctgaatt agataatgtc gtaatagagt
     6661 ggagtaaagt tttaccagta aaagtttgta agttacacac aaatatgggg cttggcgatg
     6721 cacttaactt tggtatcaaa cattgtaata atgaattgat cgcaagaatg gatactgacg
     6781 atatttgtgc taaggataga ttcaaattgc aactgcaata tttcaatgaa aaccctactc
     6841 ttactttaat cgggggaggt attgaagaat atgatgagga aatgcaagtc ctgagaggaa
     6901 cgcgttttac aaaagaaaaa catgctgata ttgtcaaata tgcttgcttc aaaaacccat
     6961 tcaatcatat gactgtaatg tttaagaaaa aagatattca gtctgtcggt ggatataaaa
     7021 aacatctatt aatggaagat tataatcttt ggttgcgtct tttaaataat gggtataaaa
     7081 catataactt accagaaata cttgtttacg cgcgaacggg tattaacatg gttagaaaac
     7141 gtagaggtat gagttatttt aaaagtgaaa ttcagctttt taaactgaaa aggatgttga
     7201 actgtaattc atatagtaaa aactgtgtta tctttttaat tagagtactc ccacgattcc
     7261 ttccggtatc tgtactttca attgtatata aaataatgcg aaagtgacaa acattttttt
     7321 ataaagagaa aattaattgc ttcttccttt tttgaaaaca ataaatattg tatattactg
     7381 ttcttttaga actacttccc cgcagacagg agtaaacaat gtcaaagcaa cagatcggcg
     7441 tcgtcggtat ggcagtgatg
//