Ensure your server meets the following requirements:
- Ubuntu (18.04 or later)
- CentOS (7.0 or later)
- CPU architecture: x86-64
Note: For SDK integration on other architectures, contact sales@shengwang.cn.
- CPU: 8 cores at 1.8 GHz or higher
- Memory: 2 GB (4 GB or higher recommended)
- The server must have public internet access and a public IP address.
- The server must be able to access the .agora.io and .agoralab.co domains.
- Apache Maven or other build tools (this guide uses Apache Maven as an example)
- JDK 8
Please refer to the official example documentation.
Place the downloaded SDK into the examples/libs directory.
Create a .keys file in the examples directory and add the APP_ID and TOKEN values in the following format:
APP_ID=XXX
TOKEN=XXX
Note: If you do not have the corresponding values, leave them blank.
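If you want to read the same .keys file from your own Java code, the KEY=VALUE format above can be loaded with the standard java.util.Properties class. This is a minimal sketch under stated assumptions: the KeysFile class is hypothetical and not part of the SDK, and how the example scripts themselves parse the file is not described here.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.Properties;

/** Hypothetical helper (not part of the SDK) that reads APP_ID and TOKEN from .keys content. */
final class KeysFile {
    final String appId;
    final String token;

    private KeysFile(String appId, String token) {
        this.appId = appId;
        this.token = token;
    }

    /** Parses KEY=VALUE lines; a missing or blank value becomes an empty string. */
    static KeysFile parse(Reader reader) throws IOException {
        Properties props = new Properties();
        props.load(reader);
        return new KeysFile(
                props.getProperty("APP_ID", "").trim(),
                props.getProperty("TOKEN", "").trim());
    }

    /** Convenience overload for in-memory content, mainly useful in tests. */
    static KeysFile parse(String content) {
        try {
            return parse(new StringReader(content));
        } catch (IOException e) {
            throw new IllegalStateException("Reading from a String should not fail", e);
        }
    }
}
```

To load the real file, pass a reader such as Files.newBufferedReader(Paths.get("examples/.keys")) to KeysFile.parse. Note that Properties treats backslashes as escape characters and lines starting with # or ! as comments, so a hand-written parser may be safer if your values can contain those characters.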
This section uses MultipleConnectionPcmSendTest.sh as an example. To run a different test, substitute the corresponding .sh file.
Follow these steps to perform the test:
#!/bin/bash
set -e
cd examples
./build.sh
./script/MultipleConnectionPcmSendTest.sh
- Integrate SDK: Ensure the SDK is integrated in the examples/libs directory.
- Compile Examples: Enter the examples directory and run build.sh to compile.
- Run Test: Execute the ./script/MultipleConnectionPcmSendTest.sh script to run the test.
- Script Execution Order: Ensure scripts are executed in the above order to avoid dependency issues.
- Test Replacement: To test other functions, replace the corresponding .sh file.
- Create a new VAD instance:
AgoraAudioVad audioVad = new AgoraAudioVad();
- Initialize VAD configuration:
AgoraAudioVadConfig config = new AgoraAudioVadConfig();
audioVad.initialize(config);
Call the processPcmFrame method to process an audio frame. The frame must be 16-bit, 16 kHz, mono PCM data:
// getPcmFrame() is a hypothetical placeholder for your own audio source
byte[] frame = getPcmFrame();
VadProcessResult result = audioVad.processPcmFrame(frame);
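The frame format above (16-bit, 16 kHz, mono) fixes how many bytes each millisecond of audio occupies, which helps when slicing a longer PCM buffer into frames. The sketch below shows that arithmetic. The PcmFrames class is hypothetical, and the 10 ms frame duration in the usage note is an assumption, since this guide does not state which frame length processPcmFrame expects.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper (not part of the SDK) for sizing 16-bit, 16 kHz, mono PCM frames. */
final class PcmFrames {
    static final int SAMPLE_RATE_HZ = 16_000;
    static final int CHANNELS = 1;
    static final int BYTES_PER_SAMPLE = 2; // 16-bit samples

    /** Number of bytes in one frame of the given duration. */
    static int bytesPerFrame(int frameMillis) {
        if (frameMillis <= 0) {
            throw new IllegalArgumentException("frameMillis must be positive");
        }
        return SAMPLE_RATE_HZ * frameMillis / 1000 * CHANNELS * BYTES_PER_SAMPLE;
    }

    /** Splits a PCM buffer into whole frames; a trailing partial frame is dropped. */
    static List<byte[]> split(byte[] pcm, int frameMillis) {
        int frameBytes = bytesPerFrame(frameMillis);
        List<byte[]> frames = new ArrayList<>();
        for (int offset = 0; offset + frameBytes <= pcm.length; offset += frameBytes) {
            frames.add(Arrays.copyOfRange(pcm, offset, offset + frameBytes));
        }
        return frames;
    }
}
```

With an assumed 10 ms frame, PcmFrames.bytesPerFrame(10) is 320 bytes (160 samples of 2 bytes each). Check the SDK documentation for the frame duration it actually requires before relying on this value.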
VadProcessResult indicates the audio VAD processing result:
- state returns the current Voice Activity Detection (VAD) state:
  - 0 indicates no speech detected
  - 1 indicates speech start
  - 2 indicates ongoing speech
  - 3 indicates the end of the current speech segment
- If the state is 1, 2, or 3, outFrame will contain PCM data corresponding to the VAD state.
To perform ASR/TTS processing, send the outFrame data to the ASR system.
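One way to act on the four states is to buffer outFrame data from speech start until the end of the segment, then hand the whole segment to the ASR system. The sketch below illustrates that idea only. SpeechSegmentCollector is a hypothetical class, not part of the SDK, and it takes the state and outFrame as plain values because the accessor names on VadProcessResult are not shown in this guide.

```java
import java.io.ByteArrayOutputStream;

/** Hypothetical helper (not part of the SDK) that assembles speech segments from VAD results. */
final class SpeechSegmentCollector {
    // State codes as described in this guide.
    static final int NO_SPEECH = 0;
    static final int SPEECH_START = 1;
    static final int SPEAKING = 2;
    static final int SPEECH_END = 3;

    private final ByteArrayOutputStream segment = new ByteArrayOutputStream();

    /**
     * Feeds one VAD result into the collector.
     * Returns the complete segment when state is SPEECH_END, otherwise null.
     */
    byte[] accept(int state, byte[] outFrame) {
        if (state < NO_SPEECH || state > SPEECH_END) {
            throw new IllegalArgumentException("Unknown VAD state: " + state);
        }
        if (state == NO_SPEECH) {
            return null; // nothing to forward while no speech is detected
        }
        if (state == SPEECH_START) {
            segment.reset(); // discard any leftover data from an unfinished segment
        }
        if (outFrame != null) {
            segment.write(outFrame, 0, outFrame.length);
        }
        if (state == SPEECH_END) {
            byte[] complete = segment.toByteArray();
            segment.reset();
            return complete;
        }
        return null;
    }
}
```

Buffering a whole segment suits ASR systems that accept complete utterances. For a streaming ASR system, forwarding each outFrame as it arrives is likely the better fit.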
When the VAD instance is no longer needed, call audioVad.destroy():
audioVad.destroy();
- Release the VAD instance when the ASR system is no longer needed.
- One VAD instance corresponds to one audio stream.
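The two rules above (one instance per audio stream, released when no longer needed) can be enforced with a small registry keyed by stream ID. This is a sketch under stated assumptions: PerStreamRegistry is a hypothetical class, not part of the SDK, and it is generic so that it does not depend on any SDK types.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Consumer;
import java.util.function.Supplier;

/** Hypothetical helper (not part of the SDK): keeps one resource instance per audio stream ID. */
final class PerStreamRegistry<T> {
    private final Map<String, T> instances = new ConcurrentHashMap<>();
    private final Supplier<T> factory;
    private final Consumer<T> releaser;

    PerStreamRegistry(Supplier<T> factory, Consumer<T> releaser) {
        this.factory = factory;
        this.releaser = releaser;
    }

    /** Returns the instance for this stream, creating it on first use. */
    T forStream(String streamId) {
        return instances.computeIfAbsent(streamId, id -> factory.get());
    }

    /** Releases and forgets the instance for this stream; returns false if none existed. */
    boolean release(String streamId) {
        T instance = instances.remove(streamId);
        if (instance == null) {
            return false;
        }
        releaser.accept(instance);
        return true;
    }

    /** Number of streams that currently hold an instance. */
    int size() {
        return instances.size();
    }
}
```

For VAD, the factory would create and initialize an AgoraAudioVad and the releaser would call its destroy() method, so each stream ID maps to exactly one instance and release(streamId) frees it when the stream ends.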
AgoraAudioVad is a management class for Voice Activity Detection (VAD). Through this class, you can process and analyze audio data.
- Call the AgoraAudioVad constructor to create an AgoraAudioVad object.
- Use AgoraAudioVadConfig to configure the AgoraAudioVad object.
initialize: Configures the AgoraAudioVad object.
- Parameters
  - config: Configuration parameters (type AgoraAudioVadConfig)
- Return Value
  - 0: Success
  - Non-0: Failure
processPcmFrame: Processes audio PCM data.
- Parameters
  - frame: Audio PCM data (byte array)
- Return Value
  - VadProcessResult object, containing:
    - state: Current Voice Activity Detection (VAD) state
    - outFrame: PCM data corresponding to the VAD state
destroy: Destroys the VAD instance.
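AgoraAudioVad vad = new AgoraAudioVad();
AgoraAudioVadConfig config = new AgoraAudioVadConfig();
// Set configuration parameters on config here
int result = vad.initialize(config);
if (result == 0) {
    // getPcmFrame() is a hypothetical placeholder returning 16-bit, 16 kHz, mono PCM data
    byte[] audioFrame = getPcmFrame();
    VadProcessResult processResult = vad.processPcmFrame(audioFrame);
    // Handle VAD results (the state and outFrame values of processResult)
}
// Release the VAD instance once it is no longer needed
vad.destroy();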