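Python Speech Recognition Program

A program that uses Python to run commands through speech recognition

= Voice Command Execution Program

Uses Python's SpeechRecognition library to convert speech to text
Uses the subprocess module to run specific commands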

Step 1: Install the required libraries

Run the following command in a Command Prompt or terminal:
```
pip install SpeechRecognition PyAudio
```
- SpeechRecognition: the core library for speech recognition. It supports several recognition engines, including the Google Web Speech API.
- PyAudio: the library SpeechRecognition requires in order to handle microphone input.
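
Before moving on, it can help to confirm that both packages are importable. The following is a minimal sketch that uses only the standard library. It assumes the import names `speech_recognition` and `pyaudio` (these differ from the pip package names), and the helper name `check_dependencies` is hypothetical, not part of either library.

```python
import importlib.util

# Import names differ from the pip package names:
# SpeechRecognition -> speech_recognition, PyAudio -> pyaudio.
REQUIRED_MODULES = ("speech_recognition", "pyaudio")


def check_dependencies(modules=REQUIRED_MODULES):
    """Return a dict mapping each module name to True if it can be found."""
    return {name: importlib.util.find_spec(name) is not None for name in modules}


if __name__ == "__main__":
    for name, available in check_dependencies().items():
        status = "OK" if available else "missing"
        print(f"{name}: {status}")
```

If either module is reported as missing, rerun the pip command above in the same Python environment that will run the script.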

Step 2: Write the code

import speech_recognition as sr
import subprocess
import platform

def recognize_speech_from_mic(recognizer, microphone):
    """
    Recognize speech from the microphone.
    """
    with microphone as source:
        recognizer.adjust_for_ambient_noise(source)  # calibrate for ambient noise
        print("말씀해주세요!")
        audio = recognizer.listen(source)

    response = {
        "success": True,
        "error": None,
        "transcription": None
    }

    try:
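        response["transcription"] = recognizer.recognize_google(audio, language="ko-KR")  # Korean speech recognition
    except sr.RequestError:
        # The API could not be reached
        response["success"] = False
        response["error"] = "API에 연결할 수 없습니다."
    except sr.UnknownValueError:
        # The speech could not be understood; mark as failed so callers
        # never receive a None transcription as a successful result
        response["success"] = False
        response["error"] = "음성을 이해할 수 없습니다."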

    return response

def execute_command(command_text):
    """
    Run a command based on the recognized text.
    """
    command_text = command_text.lower()  # lowercase to simplify comparison

    if "메모장 열기" in command_text:
        print("메모장을 엽니다.")
        if platform.system() == "Windows":
            subprocess.Popen(["notepad.exe"])
        elif platform.system() == "Darwin": # macOS
            subprocess.Popen(["open", "-a", "TextEdit"])
        else: # Linux
            subprocess.Popen(["gedit"])
    elif "계산기 열기" in command_text:
        print("계산기를 엽니다.")
        if platform.system() == "Windows":
            subprocess.Popen(["calc.exe"])
        elif platform.system() == "Darwin":
            subprocess.Popen(["open", "-a", "Calculator"])
        else:
            subprocess.Popen(["gnome-calculator"])
    elif "안녕" in command_text:
        print("안녕하세요! 무엇을 도와드릴까요?")
    elif "종료" in command_text:
        print("프로그램을 종료합니다.")
        return False
    else:
        print(f"'{command_text}' 명령어는 알 수 없습니다.")
    return True

if __name__ == "__main__":
    recognizer = sr.Recognizer()
    microphone = sr.Microphone()

    print("음성 인식 프로그램이 시작되었습니다. '종료'라고 말하면 종료됩니다.")

    running = True
    while running:
        speech = recognize_speech_from_mic(recognizer, microphone)

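        # Only run a command when there is an actual transcription;
        # unintelligible speech leaves "transcription" as None.
        if speech["success"] and speech["transcription"] is not None:
            print(f"인식된 명령어: {speech['transcription']}")
            running = execute_command(speech["transcription"])
        else:
            print(f"오류: {speech['error']}")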

        print("-" * 30)

{{결론 |내용=* Code explanation

import statements:
- speech_recognition as sr: the speech recognition library.
- subprocess: a module for launching external programs.
- platform: used to detect the current operating system.
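recognize_speech_from_mic(recognizer, microphone) function:
- Captures audio from the microphone and converts it to text.
- recognizer.adjust_for_ambient_noise(source): samples the ambient noise through the microphone and calibrates the recognizer's energy threshold, which improves recognition accuracy.
- recognizer.listen(source): waits for the user to speak and records the utterance.
- recognizer.recognize_google(audio, language="ko-KR"): converts the audio to text using the Google Web Speech API. language="ko-KR" selects Korean.
- A try-except block handles API connection errors and recognition failures.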
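execute_command(command_text) function:
- Runs a specific command based on the recognized text (command_text).
- command_text.lower(): converts the text to lowercase so commands are matched case-insensitively. This only affects Latin letters, since Hangul has no case.
- Checks whether a given keyword is present, as in if "메모장 열기" in command_text:.
- subprocess.Popen([...]): launches the program appropriate for the operating system.
  - Windows: notepad.exe, calc.exe
  - macOS: open -a TextEdit, open -a Calculator
  - Linux: gedit, gnome-calculator (must be installed)
- When the "종료" command is recognized, the function returns False, which ends the main loop.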
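Main execution block (if __name__ == "__main__":):
- Creates the sr.Recognizer() and sr.Microphone() objects.
- A loop (while running:) keeps recognizing speech and running commands until running becomes False.
- Calls recognize_speech_from_mic to recognize speech, then calls execute_command depending on the result.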
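
The keyword matching and per-OS program selection described above can also be expressed as a lookup table, which makes the logic testable without launching anything. This is only a sketch under stated assumptions: the names `COMMANDS` and `resolve_command` are hypothetical and do not appear in the original code, and the table simply mirrors the programs listed above.

```python
import platform

# Keyword -> per-OS argument list, mirroring the commands in the code above.
# The "Linux" entries assume gedit and gnome-calculator are installed.
COMMANDS = {
    "메모장 열기": {
        "Windows": ["notepad.exe"],
        "Darwin": ["open", "-a", "TextEdit"],
        "Linux": ["gedit"],
    },
    "계산기 열기": {
        "Windows": ["calc.exe"],
        "Darwin": ["open", "-a", "Calculator"],
        "Linux": ["gnome-calculator"],
    },
}


def resolve_command(command_text, system=None):
    """Return the argument list for the first matching keyword, or None."""
    system = system or platform.system()
    for keyword, per_os in COMMANDS.items():
        if keyword in command_text:
            # Any system other than Windows/macOS falls back to the Linux
            # entry, like the else branch in the original code.
            return per_os.get(system, per_os["Linux"])
    return None
```

A caller would pass the result to `subprocess.Popen` only when it is not `None`. Separating the lookup from the launch keeps new commands to a single table entry.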

How to run

Save the code above under a name such as voice_command.py.
In a terminal or Command Prompt, change to the directory that contains the file.
Run the following command:

   python voice_command.py

When the message "말씀해주세요!" appears, speak a command such as "메모장 열기", "계산기 열기", "안녕", or "종료" into the microphone.

Additional features and improvements (advanced)

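More commands: execute_command can be extended with further commands, such as opening a website or the file explorer.
Assistant wake word: the program can be set up to respond only after a specific keyword is spoken, similar to "Hey Google" (for example, "컴퓨터야 메모장 열어줘").
GUI: a library such as PyQt or Tkinter can provide a visual interface.
Offline speech recognition: an offline engine such as Vosk or CMU Sphinx lets the program work without an internet connection.
User-defined commands: the program could learn specific patterns so that users can register their own commands.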
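
As an illustration of the wake word idea, the transcription can be filtered before it reaches execute_command. The sketch below is a plain string check, not a real wake word detector: the wake word "컴퓨터야" is taken from the example above, the helper name `strip_wake_word` is hypothetical, and it assumes the recognizer returns the wake word at the start of the transcription.

```python
WAKE_WORD = "컴퓨터야"  # hypothetical wake word, taken from the example above


def strip_wake_word(text, wake_word=WAKE_WORD):
    """Return the command that follows the wake word, or None if it is absent."""
    if text is None:
        return None
    text = text.strip()
    if not text.startswith(wake_word):
        return None
    # Drop the wake word and any whitespace separating it from the command.
    return text[len(wake_word):].strip()
```

In the main loop, the result would be passed to execute_command only when it is not `None`. Note that a phrase such as "메모장 열어줘" does not contain the keyword "메모장 열기", so the keyword checks would also need to be loosened for natural phrasing.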
