Providing an Agent-Oriented Architecture for Semantic Mining from Large-Scale Data in Distributed Environments

Article type: Research article

Authors

1 Instructor, Imam Hussein Comprehensive University

2 Associate Professor, Iran University of Science and Technology

3 Professor, Imam Hussein Comprehensive University

Abstract

Large-scale data is voluminous, distributed, scattered, and heterogeneous, and mixes dissimilar, irrelevant, misleading, real, and unreal data. Analyzing such data, creating value from it, and exploiting it productively therefore remains an important open challenge. The aim of this research is to present a new coalition architecture for generating information that is valuable for decision making from masses of data. The proposed architecture, abbreviated ASMLDE, is intended to develop and improve data mining, semantic mining, and the production of useful, high-quality rules, and consists of four layers, seven components, and six main agents. To collect and standardize qualitative processing and more complex interpretations, the architecture uses conceptualization with the 4V's process, insight from the volume and scale of data in the form of the 3V's model, and finally qualitative insight based on data thickness. Supported by ontology and agent mining, the architecture shrinks large mining spaces, and its use of multi-agent systems increases the speed and quality of data mining operations. Automating mining operations and reducing the complexity of data and business processes are also among its most important achievements. To evaluate the proposed architecture, a large-scale dataset from the natural disasters domain and the earthquake ontology class of the DBpedia knowledge base was used. The evaluation results, obtained by mining semantic rules on this dataset, show the effectiveness and capabilities of the ASMLDE architecture in raising the quality of the mined semantic rules in line with user needs and in shrinking the large data mining space relative to other similar frameworks and architectures.

Article Title [English]

Providing an Agent-Based Architecture for Semantic Mining From Large-Scale Data in Distributed Environments

Authors [English]

  • Hussein Saberi 1
  • M. R. Kangavari 2
  • M. R. Hasani Ahangar 3
1 Imam Hussein Comprehensive University
2 Iran University of Science and Technology (IUST)
3 Imam Hussein Comprehensive University (IHU)
Abstract [English]

Large-scale data may consist of big, distributed, scattered, heterogeneous, irrelevant, misleading, real, and unrealistic data, or any combination of these. Analyzing such data, creating value from it, and using it productively is therefore an important and open challenge. The purpose of this study is to present a new coalition architecture for generating valuable information for decision making from masses of data. The proposed architecture, abbreviated ASMLDE, aims to develop and improve data mining and semantic mining and to produce useful, high-quality rules. It consists of four layers, seven components, and six main agents. To collect and standardize qualitative processing and more complex interpretations, the architecture uses conceptualization with the 4V's process, insight into the volume and scale of data in the form of the 3V's model, and finally qualitative insight based on data thickness. Supported by ontology and agent mining, the architecture reduces large search spaces, and its use of multi-agent systems increases the speed and quality of data mining operations. Automating mining operations and reducing the complexity of data and business processes are also important achievements of the proposed architecture. To evaluate the architecture, a large-scale dataset from the natural disasters domain and the earthquake ontology class of the DBpedia knowledge base was used. The evaluation results, obtained by mining semantic rules from this dataset, highlight the effectiveness and capabilities of the ASMLDE architecture in enhancing the quality of the mined semantic rules to fit user needs and in reducing the large data mining space compared with other similar frameworks and architectures.

Keywords [English]

  • Large Scale Data
  • Semantic Mining
  • Ontology
  • Agent-Oriented Architecture  