An Approach to Enhancing the Dependability of Cache Memory

Article type: Research article

Authors

1 Associate Professor, Shahid Bahonar University of Kerman, Kerman, Iran

2 Master's degree, Shahid Bahonar University of Kerman, Kerman, Iran

Abstract

Modern processors, which contain large cache memories, are highly vulnerable to transient errors. The importance of this issue has led to the use of coding methods for protection against errors. The cost of reliability must be met in a way that makes efficient use of energy and area. This paper presents a method for increasing the dependability of unprotected cache memory in processors. The reliability of cache memory in modern processors is studied using a low-cost tag error mitigation mechanism. The proposed method exploits the technique of increasing the Hamming distance in tags. In addition to reducing false hits to zero, it has very low overhead.

Keywords


Article title [English]

An Approach to Dependability Enhancement of Cache Memories

Authors [English]

  • Mahdieh Ghazvini 1
  • Mohammadjavad Shahnavazi 2
  • Behnam Ghavami 1
1 Associate Professor, Shahid Bahonar University of Kerman, Kerman, Iran
2 Master's degree, Shahid Bahonar University of Kerman, Kerman, Iran
Abstract [English]

Modern processors with large caches are highly vulnerable to transient errors. Because of the importance of this problem, coding methods are adopted for protection against errors. The cost of reliability should be met in a way that uses energy and area efficiently. This paper proposes an approach to enhancing the dependability of unprotected caches in processors. To this end, the reliability of caches in modern processors is studied using a low-cost tag error mitigation mechanism. By increasing the Hamming distance between tags, the proposed approach reduces the false hit rate to zero while incurring very low overhead.
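
As a rough illustration of why a larger Hamming distance between tags suppresses false hits, the sketch below counts the single-bit flips that would turn a stored tag into a different lookup tag. It is not the mechanism proposed in the paper: the appended even-parity bit is only an assumed stand-in for a distance-increasing encoding, and the function names and example tag values are hypothetical.

```python
def hamming_distance(a: int, b: int) -> int:
    """Number of bit positions in which two tags differ."""
    return bin(a ^ b).count("1")


def with_even_parity(tag: int) -> int:
    """Append one even-parity bit as the new least significant bit.

    Assumed encoding for illustration only: any two distinct tags
    end up at Hamming distance 2 or more.
    """
    parity = bin(tag).count("1") % 2
    return (tag << 1) | parity


def single_flip_false_hits(stored_tags, lookup_tag, width):
    """Return (stored_tag, bit) pairs where flipping one bit of a stored
    tag makes it equal to a lookup tag it should not match."""
    hits = []
    for tag in stored_tags:
        if tag == lookup_tag:
            continue  # a genuine hit, not a false one
        for bit in range(width):
            if tag ^ (1 << bit) == lookup_tag:
                hits.append((tag, bit))
    return hits


# Hypothetical 4-bit tags held in one cache set.
stored = [0b1010, 0b0111]
lookup = 0b1011  # differs from 0b1010 in exactly one bit

# Plain tags: one transient bit flip can cause a false hit.
plain = single_flip_false_hits(stored, lookup, width=4)

# Parity-extended tags: no single bit flip can cause a false hit.
encoded = single_flip_false_hits(
    [with_even_parity(t) for t in stored], with_even_parity(lookup), width=5
)

print(plain, encoded)
```

With plain tags, a flip of the lowest bit of `0b1010` produces the lookup tag, so one false hit is possible. Once every pair of tags is at distance 2 or more, no single flip can do so, at the cost of one extra stored bit per tag in this assumed encoding.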

Keywords [English]

  • Hamming distance
  • reliability
  • false hit
  • cache memory
  • error


Volume 10, Issue 1 - Serial Number 37
Serial Number 37, Spring Quarterly
Khordad 1401
Pages 1-10
  • Received: 11 Bahman 1399
  • Revised: 10 Mordad 1400
  • Accepted: 20 Azar 1400
  • Published: 01 Khordad 1401