Constructing a gene tree using Kullback-Leibler divergence on genes affecting milk production in dairy cattle

Dehghanzadeh, Hooshang; Mirhoseini, Seyed Ziaeddin; Ghaderi-Zefrehei, Mostafa; Tavakoli, Hassan; Esmaeilkhanian, Saeid

doi:10.22124/ar.2017.2607

Animal Production Research
Article 2, Volume 6, Issue 3, Azar 1396 (November-December 2017), Pages 11-28
Article type: Research paper
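Authors
Hooshang Dehghanzadeh¹; Seyed Ziaeddin Mirhoseini²*; Mostafa Ghaderi-Zefrehei³; Hassan Tavakoli⁴; Saeid Esmaeilkhanian⁵
¹PhD student, Department of Animal Science, Faculty of Agricultural Sciences, University of Guilan
²Professor, Department of Animal Science, Faculty of Agricultural Sciences, University of Guilan
³Assistant Professor, Department of Animal Science, Faculty of Agricultural Sciences, Yasouj University
⁴Assistant Professor, Department of Electrical Engineering, Faculty of Engineering, University of Guilan
⁵Associate Professor, Animal Science Research Institute of Iran, Agricultural Research, Education and Extension Organization, Karaj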
Abstract
Information theory is a branch of mathematics that overlaps with communications engineering, biology, and medicine. The aim of this study was to present a method for clustering a number of genes affecting milk production in dairy cattle, using an algorithm based on Kullback-Leibler divergence. After the DNA sequences of the genes and exons affecting milk production in dairy cattle were extracted, the entropy parameter of orders one to four was calculated for each gene and for the exons of each gene. To obtain the distances between genes, Kullback-Leibler divergence was applied in three different ways. The first and second methods were alignment-based, whereas the third was alignment-free and based on the relative entropy of the genes. The results of all three Kullback-Leibler divergence methods on the DNA sequences of the genes and exons were clustered with seven common methods: Single, Complete, Average, Weighted, Centroid, Median, and KMeans. The results of each clustering were aggregated with the AdaBoost algorithm, which in itself produced a kind of gene tree. This aggregation showed that the third method yielded a biologically reasonable clustering for a set of genes, because it agreed with the genomic annotation results for those genes obtained from GeneMANIA. The authors believe that the proposed method for constructing a gene tree can compete with other DNA-sequence-based methods for clustering a set of genes, and that it can therefore also be used to group genes of other species.
Keywords
Information theory; gene clustering; dairy cattle; Kullback-Leibler divergence
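
The abstract names two quantities without giving formulas: the entropy of orders one to four, and an alignment-free distance based on relative entropy. The sketch below shows one common way such quantities are computed from k-mer frequencies. It is an illustration under assumptions, not the authors' implementation: the function names, the additive pseudocount, the symmetrized form of the divergence, and the toy sequences are all hypothetical choices made here.

```python
import math
from collections import Counter
from itertools import product

ALPHABET = "ACGT"


def kmer_distribution(seq, k, pseudocount=0.0):
    """Relative frequencies of all 4**k overlapping k-mers in seq.

    k-mers containing symbols outside ACGT are ignored. A positive
    pseudocount (an assumption of this sketch) keeps every probability
    non-zero so that the divergence below stays finite.
    """
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    kmers = ["".join(p) for p in product(ALPHABET, repeat=k)]
    total = sum(counts[m] for m in kmers) + pseudocount * len(kmers)
    return {m: (counts[m] + pseudocount) / total for m in kmers}


def block_entropy(seq, k):
    """Shannon entropy in bits of the order-k (k-mer) distribution."""
    dist = kmer_distribution(seq, k)
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)


def kl_divergence(p, q):
    """Kullback-Leibler divergence D(p || q) in bits."""
    return sum(p[m] * math.log2(p[m] / q[m]) for m in p if p[m] > 0)


def symmetric_kl(seq_a, seq_b, k=2, pseudocount=1.0):
    """Alignment-free distance: D(p || q) + D(q || p) on k-mer frequencies."""
    p = kmer_distribution(seq_a, k, pseudocount)
    q = kmer_distribution(seq_b, k, pseudocount)
    return kl_divergence(p, q) + kl_divergence(q, p)


if __name__ == "__main__":
    # Toy sequences for illustration only; these are not real gene sequences.
    toy = {
        "seq_uniform": "ACGT" * 10,
        "seq_at_rich": "AATTAATATTTAAATTATAT" * 2,
        "seq_gc_rich": "GGCCGCGGCCCGGGCGCGCC" * 2,
    }
    for name, seq in toy.items():
        orders = [round(block_entropy(seq, k), 3) for k in range(1, 5)]
        print(name, "entropy, orders 1-4:", orders)
    names = list(toy)
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            print(a, b, round(symmetric_kl(toy[a], toy[b]), 3))
```

The pairwise values form a distance matrix that could then be passed to a hierarchical clustering routine, such as the Single, Complete, or Average linkage methods listed in the abstract. Plain Kullback-Leibler divergence is asymmetric, which is why this sketch sums both directions before treating it as a distance.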
