Detection of hydin Gene Duplication in Personal Genome Sequence Data

  • Kim, Jong-Il (Genomic Medicine Institute, Medical Research Center, Seoul National University)
  • Ju, Young-Seok (Genomic Medicine Institute, Medical Research Center, Seoul National University)
  • Kim, Shee-Hyun (Macrogen Inc.)
  • Hong, Dong-Wan (Genomic Medicine Institute, Medical Research Center, Seoul National University)
  • Seo, Jeong-Sun (Genomic Medicine Institute, Medical Research Center, Seoul National University)
  • Published : 2009.09.30


Human personal genome sequencing can be done efficiently by aligning a huge number of short reads, produced by various next-generation sequencing (NGS) technologies, to the reference genome sequence. One of the major obstacles is the incompleteness of the human reference genome. We analyzed the effect of hidden gene duplication on NGS data using the known example of the hydin gene. Hydin2, a duplicated copy of hydin (which lies on chromosome 16q22), has recently been found to be localized to chromosome 1q21 and is not included in the current version of the standard human reference genome. We found that none of the eight personal genome data sets published so far contains hydin2, and that there is a large number of non-synonymous SNPs (nsSNPs) in hydin. The heterozygosity of those nsSNPs was significantly higher than expected, and the sequence coverage depth in the hydin gene was about twofold the average depth, as would be expected if reads derived from hydin2 were aligned to hydin. We believe that these unique findings for hydin can serve as useful indicators for discovering new hidden multiplications in the human genome.
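The two indicators named in the abstract (coverage depth well above the genome average, and an excess of heterozygous nsSNPs) can be combined into a simple screen. The following is only a minimal sketch under stated assumptions, not the authors' method: the `GeneStats` record, the function names, the depth-ratio cutoff of 1.5, the expected heterozygous fraction of 0.6, and the significance level are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass
from math import comb


@dataclass
class GeneStats:
    """Per-gene summary (hypothetical structure, not from the paper)."""
    name: str
    mean_depth: float  # mean read depth over the gene
    n_snps: int        # number of nsSNPs called in the gene
    n_het: int         # how many of those were called heterozygous


def binomial_upper_tail(k: int, n: int, p: float) -> float:
    """Return P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def looks_duplicated(
    gene: GeneStats,
    genome_mean_depth: float,
    expected_het_fraction: float = 0.6,  # assumed background rate
    min_depth_ratio: float = 1.5,        # assumed cutoff; ~2 is expected for one extra copy
    alpha: float = 0.01,                 # assumed significance level
) -> bool:
    """Flag a gene when BOTH indicators point to a hidden extra copy."""
    if gene.n_snps == 0:
        return False
    depth_ratio = gene.mean_depth / genome_mean_depth
    # One-sided test: are heterozygous calls more frequent than expected?
    p_value = binomial_upper_tail(gene.n_het, gene.n_snps, expected_het_fraction)
    return depth_ratio >= min_depth_ratio and p_value < alpha


# Made-up numbers for illustration only; they are not results from the paper.
candidate = GeneStats("gene_A", mean_depth=60.0, n_snps=20, n_het=20)
control = GeneStats("gene_B", mean_depth=31.0, n_snps=20, n_het=12)
print(looks_duplicated(candidate, genome_mean_depth=30.0))  # True
print(looks_duplicated(control, genome_mean_depth=30.0))    # False
```

Requiring both signals at once is a deliberate design choice in this sketch: elevated depth alone can come from sequencing bias, and high heterozygosity alone can reflect genuine variation, whereas their co-occurrence is the pattern the abstract describes for hydin.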



