Yeyupiaoling

Tank Battle Controlled by Voice Commands

2023-12-17 206 views Speech Pytorch Speech Recognition Artificial Intelligence Pytorch Voice Command

This article walks through developing a program that controls the Tank Battle game with voice commands, covering environment setup, game startup, and fine-tuning of the command model. The project is developed with Anaconda 3, Windows 11, Python 3.11, and the corresponding libraries. Users can adjust parameters in `main.py` such as recording time and data length, add new commands in `instruct.txt`, and write processing functions to start the game. Next, `record_data.py` is run to record command audio and generate training...
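The idea of pairing each command with a processing function can be sketched as follows. This is a minimal illustration, not the project's code: it assumes a command file with one command per line, and the command names, handler functions, and the `on_recognized` callback are all hypothetical.

```python
from typing import Callable, Dict, List


def load_commands(text: str) -> List[str]:
    """Parse a command list, assuming one command per line (hypothetical format)."""
    return [line.strip() for line in text.splitlines() if line.strip()]


def build_dispatcher(
    commands: List[str], handlers: Dict[str, Callable[[], None]]
) -> Dict[str, Callable[[], None]]:
    """Map every known command to its handler, failing early if one is missing."""
    missing = [c for c in commands if c not in handlers]
    if missing:
        raise KeyError(f"no handler registered for: {missing}")
    return {c: handlers[c] for c in commands}


# Stand-ins for real key presses, recorded in a list so the flow is easy to inspect.
pressed: List[str] = []


def move_up() -> None:
    pressed.append("up")


def fire() -> None:
    pressed.append("space")


# Hypothetical contents of a command file such as instruct.txt.
commands = load_commands("up\nfire\n\n")
dispatch = build_dispatcher(commands, {"up": move_up, "fire": fire})


def on_recognized(label: str) -> bool:
    """Run the handler for a recognized label; ignore labels that are not commands."""
    handler = dispatch.get(label)
    if handler is None:
        return False
    handler()
    return True


on_recognized("up")
on_recognized("fire")
on_recognized("jump")  # not a registered command, so nothing happens
```

Checking for missing handlers when the dispatcher is built means a newly added command without a function fails at startup rather than in the middle of a game.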

Run Large Language Model Service with One Click and Build a Chat Application

2023-10-23 203 views Pytorch Deep Learning language model Artificial Intelligence Natural Language Processing Large Language Model (LLM)

This article introduces a method to build a local large language model chat service based on the Qwen-7B-Int4 model. First, install the GPU version of PyTorch and the other dependency libraries. Then, run `server.py` in a terminal to start the service. The service supports Windows and Linux and has a low VRAM requirement: it runs smoothly on a graphics card with 8 GB of memory. The source code of an Android application is also provided. By modifying the service address and opening the `AndroidClient` file with Android Studio...

Easily and Quickly Set Up a Local Speech Synthesis Service

2023-10-22 208 views Speech Pytorch Deep Learning Pytorch Speech Synthesis

This article introduces a method to quickly set up a local speech synthesis service using the VITS model architecture. First, install the PyTorch environment and the related dependency libraries. To start the service, simply run the `server.py` program. The source code for an Android application is also provided; its server address must be changed to point to your local service. At the end of the article, a QR code is provided for joining the author's Knowledge Planet community and obtaining the complete source code. The entire process is simple and efficient, and the service can run without an internet connection.

Real-time Speech Recognition Service with Remarkably High Recognition Accuracy

2023-10-21 189 views Speech Pytorch Speech Recognition Artificial Intelligence

This article introduces the installation, configuration, and application deployment of the FunASR speech recognition framework. First, PyTorch and related dependency libraries need to be installed. For the CPU version, use the command `conda install pytorch torchvision torchaudio cpuonly -c pytorch`; for the GPU version, use `conda install pytorch torchvision torchaudio pytorch-cuda=11.8 -c p`...

FunASR Speech Recognition GUI Application

2023-10-08 241 views Speech Pytorch Speech Recognition Artificial Intelligence FunASR Pytorch

This article introduces a speech recognition GUI application developed based on FunASR, which supports recognition of local audio and video files as well as recognition of recorded audio. The application includes short audio recognition, long audio recognition (with and without timestamps), and audio file playback. The installation environment requires dependencies such as PyTorch (CPU/GPU), FFmpeg, and pyaudio. To use the application, execute `main.py`. The interface provides four options: short speech recognition, long speech recognition, recording recognition, and playback functionality. Among them, long speech recognition is divided into two models: one for concatenated output and another for explicit

Voiceprint Recognition System Implemented Based on PyTorch

2023-08-20 501 views Speech Pytorch Deep Learning Pytorch Artificial Intelligence Python Voiceprint Recognition Deep Learning

This project provides an implementation of voice recognition based on PaddlePaddle, mainly using the EcapaTDNN model, and integrates functions of speech recognition and voiceprint recognition. Below, I will summarize the project structure, functions, and how to use these functions. ## Project Structure ### Directory Structure ``` VoiceprintRecognition-PaddlePaddle/ ├── docs/ # Documentation │ └── README.md # Project description document ```

Voiceprint Recognition System Based on PaddlePaddle

2023-08-20 271 views Speech PaddlePaddle Deep Learning PaddlePaddle Artificial Intelligence Voiceprint Recognition Deep Learning

This project demonstrates how to use PaddlePaddle for speaker recognition (voiceprint recognition), covering the complete workflow from data preparation, model training to practical application. The project has a clear structure and detailed code comments, making it suitable for learning and reference. Below are supplementary explanations for some key points mentioned: ### 1. Environment Configuration Ensure you have installed the necessary dependency libraries. If using the TensorFlow or PyTorch version, please configure the environment according to the corresponding tutorials. ### 2. Data Preparation The `data`

Fine-tuning Whisper Speech Recognition Model and Accelerating Inference

2023-04-23 321 views Speech Pytorch whisper Pytorch Deep Learning Speech Recognition Lora

This project aims to deploy a fine-tuned Whisper model to Windows desktop applications, Android APKs, and web platforms to achieve speech-to-text functionality. ### Main Steps #### Model Format Conversion 1. Clone the Whisper native code repository: ```bash git clone https://git

Segmenting Long Speech into Multiple Short Segments Using Voice Activity Detection (VAD)

2022-11-23 308 views Speech Deep Learning Python PaddlePaddle Speech Recognition Artificial Intelligence

This article introduces YeAudio, a voice activity detection (VAD) tool implemented based on deep learning. The library is installed with `python -m pip install yeaudio -i https://pypi.tuna.tsinghua.edu.cn/simple -U`, and the following code snippet can be used for speech segmentation: ```python from yeaudio.audio import AudioSegment audio_seg ```
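To show what splitting long speech into short segments involves, here is a minimal sketch that uses a simple frame-energy threshold. This is a deliberately simpler technique than the deep-learning VAD described above and does not use the YeAudio API; the function names, frame length, threshold, and silence length are all illustrative assumptions.

```python
from typing import List, Sequence, Tuple


def frame_energies(samples: Sequence[float], frame_len: int) -> List[float]:
    """Mean squared amplitude of each non-overlapping frame."""
    return [
        sum(s * s for s in samples[i:i + frame_len]) / frame_len
        for i in range(0, len(samples) - frame_len + 1, frame_len)
    ]


def split_on_silence(
    samples: Sequence[float],
    frame_len: int = 160,         # assumed: 10 ms frames at 16 kHz
    threshold: float = 0.01,      # assumed energy threshold for "voiced"
    min_silence_frames: int = 3,  # silent frames needed to close a segment
) -> List[Tuple[int, int]]:
    """Return (start, end) sample indices of voiced segments."""
    segments: List[Tuple[int, int]] = []
    start = None      # frame index where the current segment began
    silence_run = 0   # consecutive silent frames seen inside a segment
    energies = frame_energies(samples, frame_len)
    for idx, energy in enumerate(energies):
        if energy >= threshold:
            if start is None:
                start = idx
            silence_run = 0
        elif start is not None:
            silence_run += 1
            if silence_run >= min_silence_frames:
                end = idx - silence_run + 1  # first silent frame closes the segment
                segments.append((start * frame_len, end * frame_len))
                start, silence_run = None, 0
    if start is not None:
        # Audio ended inside a segment; drop any trailing silent frames.
        end = len(energies) - silence_run
        segments.append((start * frame_len, end * frame_len))
    return segments


# Synthetic signal: silence, a burst of "speech", silence, another burst.
silence = [0.0] * 800
burst = [0.5 if i % 2 == 0 else -0.5 for i in range(800)]
segments = split_on_silence(silence + burst + silence + burst)
```

A learned VAD replaces the fixed energy threshold with a per-frame speech probability, but the step that groups consecutive voiced frames into segments is similar.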

Training a Chinese Punctuation Model Based on PaddlePaddle

2022-09-14 257 views PaddlePaddle PaddlePaddle Deep Learning Artificial Intelligence Natural Language Processing Speech Recognition

This project provides a complete process to train and use a model for adding punctuation marks to Chinese text. Below is a summary of the entire process: 1. **Environment Preparation**: - Ensure necessary libraries are installed, such as `paddlepaddle-gpu` and `PaddleNLP`. - Configure the training dataset. 2. **Data Processing and Preprocessing**: - Tokenize the input text and label the punctuation marks. - Create splits for training, validation, and test sets. 3.

Speech Emotion Recognition Based on PyTorch

2022-07-07 271 views Pytorch Speech Deep Learning Pytorch Speech Recognition Deep Learning Speech Classification Emotion Recognition

This project provides a detailed introduction to how to perform emotion classification from audio using PyTorch, covering the entire process from data preparation, model training to prediction. Below, I will give more detailed explanations for each step and provide some improvement suggestions and precautions. ### 1. Environment Setup Ensure you have installed the necessary Python libraries: ```bash pip install torch torchvision torchaudio numpy matplotlib seaborn soundf ```

Speech Emotion Recognition Based on PaddlePaddle

2022-07-06 183 views PaddlePaddle Speech PaddlePaddle Speech Recognition Artificial Intelligence

This article describes the training and prediction process for a speech classification task based on PaddlePaddle, with a detailed code example and explanations of what each part does. ### 1. Environment Preparation Ensure that the necessary dependency libraries are installed, including `paddle` (the PaddlePaddle package). You can install it using the following command: ```bash pip install paddlepaddle==2.4.1 ``` ### 2. Code Implementation

Easily Implement Speech Synthesis with PaddlePaddle

2022-07-06 200 views PaddlePaddle Speech Speech Recognition Artificial Intelligence Speech Synthesis PaddlePaddle

This article introduces how to implement speech synthesis with PaddlePaddle, including simple code examples, a GUI program, and a Flask web interface. First, a simple program implements basic text-to-speech, using an acoustic model and a vocoder to complete the synthesis and save the result as an audio file. Second, the `gui.py` interface program is introduced to simplify operation for users. Finally, the Flask web service provided by `server.py` is demonstrated, which can be called by Android applications or mini-programs to achieve remote speech...

Building an Animal Recognition System with PaddlePaddle to Identify Thousands of Animal Species

2022-07-06 221 views Deep Learning PaddlePaddle PaddlePaddle Artificial Intelligence Animal Recognition Image Recognition

This article introduces a project for animal recognition using PaddlePaddle. First, the animal recognition task can be completed with just a few lines of code. Second, a GUI is provided so that users can upload images for recognition. Finally, a Flask web interface is supported for calls from Android, enabling cross-platform applications. The project covers details such as the model path, image reading, and prediction result output, and screenshots of the running program are attached to show the results.

ECAPA-TDNN Voiceprint Recognition Model Implemented with PyTorch

2022-05-04 210 views Speech Pytorch Deep Learning Artificial Intelligence Voiceprint Recognition Pytorch EcapaTdnn

This project demonstrates how to implement speech recognition functionality using PaddlePaddle, specifically including voiceprint comparison and voiceprint registration. Below is a summary of the main content and some improvement suggestions: ### 1. Project Structure and Functions - **Voiceprint Comparison**: Compare the voice features of two audio files to determine if they are from the same person. - **Voiceprint Registration**: Store the voice data of new users in a database and generate corresponding user information. ### 2. Technology Stack - Use PaddlePaddle for model training and prediction.
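Voiceprint comparison is commonly done by scoring two speaker embeddings with cosine similarity and applying a threshold. The sketch below illustrates that idea only: the three-dimensional vectors stand in for real model embeddings, and the function names and the 0.6 threshold are assumptions rather than values taken from the project.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two embedding vectors."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("embeddings must be non-zero")
    return dot / (norm_a * norm_b)


def same_speaker(a: Sequence[float], b: Sequence[float], threshold: float = 0.6) -> bool:
    """Decide whether two embeddings belong to one speaker (threshold is assumed)."""
    return cosine_similarity(a, b) >= threshold


# Toy embeddings; a real system would produce them with the trained model.
enrolled = [0.2, 0.9, 0.1]
probe_same = [0.25, 0.85, 0.05]
probe_other = [0.9, -0.1, 0.3]

match = same_speaker(enrolled, probe_same)
mismatch = same_speaker(enrolled, probe_other)
```

In practice the threshold is tuned on a validation set, trading false accepts against false rejects.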

ECAPA-TDNN Speaker Recognition Model Implemented Based on PaddlePaddle

2022-05-01 264 views PaddlePaddle Speech PaddlePaddle Deep Learning Python Voiceprint Recognition Artificial Intelligence

This project is a voiceprint recognition system based on PaddlePaddle. It covers application scenarios from data preprocessing, model training to voiceprint recognition and comparison, and is suitable for practical applications such as voiceprint login. Here is a detailed analysis of the project: ### 1. Environment Preparation and Dependency Installation First, ensure that PaddlePaddle and other dependent libraries such as `numpy`, `matplotlib`, etc., have been installed. They can be installed using the following command: ```bash pip install paddlepaddle ```

Adding Punctuation Marks to Speech Recognition Text

2022-01-13 293 views PaddlePaddle Deep Learning Python Deep Learning PaddlePaddle Speech Recognition Natural Language Processing

This article introduces a method for adding punctuation marks to speech recognition text according to grammar, in four steps: downloading and decompressing the model, installing the PaddleNLP and PPASR tools, importing the `PunctuationPredictor` class, and using this class to add punctuation marks to the text automatically. The specific steps are as follows: 1. Download the model and decompress it into the `models/` directory. 2. Install the relevant libraries of PaddleNLP and PPASR. 3. Instantiate the predictor using the `PunctuationPredictor` class and pass in the pre

PPASR Streaming and Non-Streaming Speech Recognition

2021-11-30 263 views PaddlePaddle Speech Deep Learning Artificial Intelligence Deep Learning PaddlePaddle Speech Recognition DeepSpeech2

This document introduces how to deploy and test a speech recognition model implemented using PaddlePaddle, and provides various methods to execute and demonstrate the model's functionality. The following is a summary and interpretation of the document content: ### 1. Introduction - Provides an overview of PaddlePaddle-based speech recognition models, including recognition for short voice segments and long audio clips. ### 2. Deployment Methods #### 2.1 Command-line Deployment Two commands are provided to implement different deployment methods: - `python infer_server.

Processing and Usage of the WenetSpeech Dataset

2021-11-30 294 views Speech PaddlePaddle Deep Learning Speech Recognition PaddlePaddle WenetSpeech Mandarin Speech Dataset Chinese Speech Dataset

The WenetSpeech dataset provides more than 22,400 hours of Mandarin Chinese speech, divided into strong-labeled (10,005 hours), weak-labeled (2,478 hours), and unlabeled (9,952 hours) subsets, suitable for supervised, semi-supervised, or unsupervised training. The data is grouped by domain and style, and datasets of different scales (S, M, L) as well as evaluation/test data are provided. The tutorial details how to download, prepare, and use this dataset for training speech recognition models, making it a valuable reference for ASR system developers.
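The subset sizes quoted above can be tallied with a few lines of Python. The hour counts come from the summary; the mapping from training regime to subsets is an illustrative assumption and not something the dataset prescribes.

```python
from typing import Dict, List

# Hours per subset, as quoted in the summary above.
SUBSET_HOURS: Dict[str, int] = {
    "strong": 10005,
    "weak": 2478,
    "unlabeled": 9952,
}

# Assumed pairing of training regimes with the subsets they might draw on.
REGIME_SUBSETS: Dict[str, List[str]] = {
    "supervised": ["strong"],
    "semi-supervised": ["strong", "weak"],
    "unsupervised": ["unlabeled"],
}


def hours_for(regime: str) -> int:
    """Total audio hours available to a regime under the assumed mapping."""
    return sum(SUBSET_HOURS[name] for name in REGIME_SUBSETS[regime])


total_hours = sum(SUBSET_HOURS.values())
semi_supervised_hours = hours_for("semi-supervised")
```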

Fast Face Recognition Model Implemented with PaddlePaddle

2021-11-03 236 views PaddlePaddle Deep Learning Deep Learning Computer Vision Artificial Intelligence

This project develops a small and efficient face recognition system based on the ArcFace and PP-OCRv2 models. The training dataset is emore (containing 85,742 individuals and 5,822,653 images), and the lfw-align-128 dataset is used for testing. The project provides complete code and preprocessing scripts. The `create_dataset.py` script is executed to organize raw data into binary file format, improving training efficiency. Model training and evaluation are controlled by `train.py` and `eval.py` respectively. The prediction function supports

A Fast Face Recognition Model Implemented Based on PyTorch

2021-11-03 224 views Pytorch Deep Learning Pytorch Deep Learning Artificial Intelligence

This project aims to develop a face recognition system with small models, high recognition accuracy, and fast inference speed. The training data is sourced from the emore dataset (5.82 million images), and the lfw-align-128 dataset is used for testing. The project combines the ArcFace loss function and MobileNet, implemented through Python scripts. The process of training the model includes data preparation, training, and evaluation, with all code available on GitHub. To start the training process, the `train.py` command is executed; for performance verification, run `ev`

PPASR Speech Recognition (Advanced Level)

2021-09-18 252 views PaddlePaddle Deep Learning Speech Speech Recognition Deep Learning PaddlePaddle

This project is an end-to-end Automatic Speech Recognition (ASR) system implemented based on Kaldi and MindSpore. The system architecture includes multiple stages such as data collection, preprocessing, model training, evaluation, and prediction. Below, I will explain each step in detail and provide some key information to help you better understand the process. ### 1. Dataset The project supports multiple datasets, such as AISHELL, Free-Spoken Chinese Mandarin Co

Sound Classification Based on PyTorch

2021-08-20 322 views Deep Learning Pytorch Speech Python Artificial Intelligence Deep Learning Pytorch Sound Classification

This code is mainly based on the PaddlePaddle framework and is used to implement a speech recognition system based on acoustic features. The project structure is clear, including functional modules such as training, evaluation, and prediction, and provides detailed command-line parameter configuration files. The following is a detailed analysis and usage instructions for the project: ### 1. Project Structure ``` . ├── configs # Configuration files directory │ └── bi_lstm.yml ├── infer.py # Acoustic model inference code ├── recor ```

Speech Recognition Model Based on PyTorch

2021-07-06 266 views Deep Learning Pytorch Speech Pytorch Deep Learning Voiceprint Recognition Chinese voiceprint ArcNet

This project demonstrates how to use the PaddlePaddle framework for voiceprint recognition, covering multiple steps from model training to application deployment. The following are some key points and improvement suggestions for this project: ### Summary of Key Points 1. **Data Preparation**: The `prepare_data.py` in the project is used to generate a dataset containing voiceprint features. 2. **Model Design**: ECAPA-TDNN was selected as the base model, and voiceprint recognition tasks were implemented through custom configurations. 3. **Training Process**: In the training...

Chinese Speaker Recognition Based on TensorFlow 2

2021-07-06 250 views TensorFlow Deep Learning Speech Tensorflow Deep Learning Voiceprint Recognition Chinese Voiceprint Recognition ArcFace

This project demonstrates how to use deep learning models for voiceprint recognition and voiceprint comparison. Below, I will optimize and improve the code and provide some suggestions to better implement these functions. ### 1. Project Structure First, ensure the project directory structure is clear and easy to understand, for example: ``` VoiceprintRecognition/ ├── data/ │ ├── train_data/ │ │ └── user_01.wav │ ├── test_ ```