Innovative Personal Assistance: Speech Recognition and NLP-Driven Robot Prototype

Michelle Valerie
Irma Salamah
Lindawati

Keywords

Personal Assistance, Text Conversion, Speech Recognition, Natural Language Processing, Google Drive

Abstract

This paper presents the development and evaluation of a personal assistant robot prototype with speech recognition and natural language processing (NLP) capabilities. A Raspberry Pi serves as the core component of the robot's hardware. The robot receives commands and responds promptly by performing the requested actions, using integrated speech recognition and NLP technologies. The prototype aims to improve meeting efficiency and productivity through audio-to-text conversion and image capture. Results show accuracy rates of 100% in Indonesian and 99% in English, with processing speed averaging 9.07 seconds per minute in Indonesian and 15.3 seconds per minute in English. An integrated webcam captures images at 1280 x 720 pixels, and real-time integration with Google Drive provides secure storage and data management. Together, NLP-based speech recognition, visual documentation, and data transfer form a single platform for managing audio, text, and image data. The findings indicate that the prototype supports smooth interaction and effective communication, particularly in meeting and collaborative work settings. Further refinements in NLP could improve efficiency and the quality of human-robot interaction.
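To put the reported processing speeds in context, the short Python sketch below converts them into a real-time factor (processing time divided by audio duration) and extrapolates to a longer recording. It assumes that "seconds per minute" means seconds of processing per minute of recorded audio, and that processing time scales linearly with recording length; neither assumption is stated in the abstract. The function names are hypothetical and are not taken from the prototype's software.

```python
# Averages reported in the abstract, read here (as an assumption) as
# seconds of processing per one minute of recorded audio.
REPORTED_SECONDS_PER_MINUTE = {"Indonesian": 9.07, "English": 15.3}


def real_time_factor(processing_seconds: float, audio_seconds: float = 60.0) -> float:
    """Return processing time divided by audio duration.

    A value below 1.0 means the audio is processed faster than real time.
    """
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return processing_seconds / audio_seconds


def estimate_processing_seconds(language: str, recording_minutes: float) -> float:
    """Estimate processing time for a recording by linear extrapolation (an assumption)."""
    return REPORTED_SECONDS_PER_MINUTE[language] * recording_minutes


if __name__ == "__main__":
    for language, seconds in REPORTED_SECONDS_PER_MINUTE.items():
        rtf = real_time_factor(seconds)
        estimate = estimate_processing_seconds(language, 30)
        print(f"{language}: real-time factor {rtf:.3f}, "
              f"estimated {estimate:.1f} s for a 30-minute recording")
```

Under these assumptions, both languages are processed several times faster than real time, with English taking roughly 1.7 times as long as Indonesian for the same amount of audio.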
