How to Implement Dictation Word Broadcast: A Super Simple Integration With ML Kit!
This article explains how to implement dictation word broadcast with an automatic voice broadcast app built on ML Kit's text recognition and text-to-speech functions.
Introduction
Most of us did dictation exercises when we first started learning a language. Today, dictation of new words from the textbook is a common after-school assignment for primary school students, and many parents read the words aloud themselves. On the one hand, this kind of word reading is simple and repetitive; on the other hand, parents' time is precious. Pre-recorded dictation audio is available: the after-class word lists from Chinese textbooks are recorded for parents to download. However, recordings are not flexible. If the teacher assigns a few extra words today that are not part of the after-class exercise, the recording no longer meets the needs of parents and children. This article describes an automatic voice broadcast app that uses the general text recognition and speech synthesis functions of ML Kit. You only need to photograph the dictation words or text, and the text in the photo is then read aloud automatically. The tone can also be adjusted.
Development Preparations
1. Open the Android Studio project-level build.gradle file.
2. Choose allprojects > repositories and configure the Maven repository address of the HMS SDK.
allprojects {
repositories {
google()
jcenter()
maven {url 'https://developer.huawei.com/repo/'}
}
}
3. Choose buildscript > repositories and configure the Maven repository address of the HMS SDK.
buildscript {
repositories {
google()
jcenter()
maven {url 'https://developer.huawei.com/repo/'}
}
}
4. Choose buildscript > dependencies and configure the AGC plug-in.
dependencies {
classpath 'com.huawei.agconnect:agcp:1.2.1.301'
}
Adding Compilation Dependencies
1. Open the application-level build.gradle file.
2. Integrate the SDKs:
dependencies{
implementation 'com.huawei.hms:ml-computer-voice-tts:1.0.4.300'
implementation 'com.huawei.hms:ml-computer-vision-ocr:1.0.4.300'
implementation 'com.huawei.hms:ml-computer-vision-ocr-cn-model:1.0.4.300'
}
3. Add the AGC plug-in to the file header.
apply plugin: 'com.huawei.agconnect'
4. Specify permissions and features: declare them in the AndroidManifest.xml file.
<uses-permission android:name="android.permission.CAMERA" />
<uses-permission android:name="android.permission.INTERNET" />
<uses-permission android:name="android.permission.READ_EXTERNAL_STORAGE" />
<uses-permission android:name="android.permission.WRITE_EXTERNAL_STORAGE" />
Key Steps of Reading Code Aloud
The app combines two functions: recognizing the text and reading it aloud, that is, OCR plus TTS. After taking a photo, you tap the play button to have the recognized text read aloud.
1. Apply for the camera permission dynamically at runtime.
private static final int PERMISSION_REQUESTS = 1;
@Override
public void onCreate(Bundle savedInstanceState) {
super.onCreate(savedInstanceState);
// Check the camera permission and request it if not yet granted.
if (!allPermissionsGranted()) {
getRuntimePermissions();
}
}
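The helpers allPermissionsGranted() and getRuntimePermissions() are not shown in the article. On Android they would typically check each required permission with ContextCompat.checkSelfPermission and request the missing ones via ActivityCompat.requestPermissions. The filtering logic at their core can be sketched platform-independently; the class and method names below are illustrative, not from the article:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

public class PermissionFilter {
    // Illustrative stand-in for the logic behind allPermissionsGranted() /
    // getRuntimePermissions(): given a map of permission name -> granted?,
    // return the permissions still missing, i.e. those that would be passed
    // to ActivityCompat.requestPermissions().
    public static List<String> missingPermissions(Map<String, Boolean> status) {
        List<String> missing = new ArrayList<>();
        for (Map.Entry<String, Boolean> entry : status.entrySet()) {
            if (!entry.getValue()) {
                missing.add(entry.getKey());
            }
        }
        return missing;
    }
}
```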
2. Start the reading interface.
public void takePhoto(View view) {
Intent intent = new Intent(MainActivity.this, ReadPhotoActivity.class);
startActivity(intent);
}
3. Invoke createLocalTextAnalyzer() in the onCreate() method to create a text recognizer on the device.
private void createLocalTextAnalyzer() {
MLLocalTextSetting setting = new MLLocalTextSetting.Factory()
.setOCRMode(MLLocalTextSetting.OCR_DETECT_MODE)
.setLanguage("zh")
.create();
this.textAnalyzer = MLAnalyzerFactory.getInstance().getLocalTextAnalyzer(setting);
}
4. Invoke createTtsEngine() in the onCreate() method to create a TTS engine, construct a TTS callback to process the TTS result, and transfer the TTS callback to the new TTS engine.
private void createTtsEngine() {
MLTtsConfig mlConfigs = new MLTtsConfig()
.setLanguage(MLTtsConstants.TTS_ZH_HANS)
.setPerson(MLTtsConstants.TTS_SPEAKER_FEMALE_ZH)
.setSpeed(0.2f)
.setVolume(1.0f);
this.mlTtsEngine = new MLTtsEngine(mlConfigs);
MLTtsCallback callback = new MLTtsCallback() {
public void onError(String taskId, MLTtsError err) {
}
public void onWarn(String taskId, MLTtsWarn warn) {
}
public void onRangeStart(String taskId, int start, int end) {
}
public void onEvent(String taskId, int eventName, Bundle bundle) {
if (eventName == MLTtsConstants.EVENT_PLAY_STOP) {
if (!bundle.getBoolean(MLTtsConstants.EVENT_PLAY_STOP_INTERRUPTED)) {
Toast.makeText(ReadPhotoActivity.this.getApplicationContext(), R.string.read_finish, Toast.LENGTH_SHORT).show();
}
}
}
};
mlTtsEngine.setTtsCallback(callback);
}
5. Set up the buttons for loading a photo, taking a photo, and reading aloud.
this.relativeLayoutLoadPhoto.setOnClickListener(new View.OnClickListener() {
public void onClick(View v) {
ReadPhotoActivity.this.selectLocalImage(ReadPhotoActivity.this.REQUEST_CHOOSE_ORIGINPIC);
}
});
this.relativeLayoutTakePhoto.setOnClickListener(new View.OnClickListener() {
public void onClick(View v) {
ReadPhotoActivity.this.takePhoto(ReadPhotoActivity.this.REQUEST_TAKE_PHOTO);
}
});
6. Start text recognition in the photo-taking and photo-loading callbacks.
private void startTextAnalyzer() {
if (this.isChosen(this.originBitmap)) {
MLFrame mlFrame = new MLFrame.Creator().setBitmap(this.originBitmap).create();
Task<MLText> task = this.textAnalyzer.asyncAnalyseFrame(mlFrame);
task.addOnSuccessListener(new OnSuccessListener<MLText>() {
public void onSuccess(MLText mlText) {
// Processing logic for recognition success.
if (mlText != null) {
ReadPhotoActivity.this.remoteDetectSuccess(mlText);
} else {
ReadPhotoActivity.this.displayFailure();
}
}
}).addOnFailureListener(new OnFailureListener() {
public void onFailure(Exception e) {
// Processing logic for recognition failure.
ReadPhotoActivity.this.displayFailure();
}
});
} else {
Toast.makeText(this.getApplicationContext(), R.string.please_select_picture, Toast.LENGTH_SHORT).show();
return;
}
}
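The success handler remoteDetectSuccess(mlText) is not shown in the article. In ML Kit, MLText exposes its recognized blocks, and each block's string value can be concatenated into the sourceText string that is later passed to the TTS engine. The joining step might look like the following sketch; the class and helper names are illustrative:

```java
import java.util.List;

public class TextJoiner {
    // Illustrative core of remoteDetectSuccess(): join the text of each
    // recognized block (e.g. from mlText.getBlocks()) into one string,
    // one block per line, for the TTS engine to read aloud.
    public static String joinBlocks(List<String> blockTexts) {
        StringBuilder sb = new StringBuilder();
        for (String text : blockTexts) {
            sb.append(text).append("\n");
        }
        return sb.toString().trim();
    }
}
```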
7. After the recognition is successful, click the play button to start the playback.
this.relativeLayoutRead.setOnClickListener(new View.OnClickListener() {
public void onClick(View v) {
if (ReadPhotoActivity.this.sourceText == null) {
Toast.makeText(ReadPhotoActivity.this.getApplicationContext(), R.string.please_select_picture, Toast.LENGTH_SHORT).show();
} else {
ReadPhotoActivity.this.mlTtsEngine.speak(sourceText, MLTtsEngine.QUEUE_APPEND);
Toast.makeText(ReadPhotoActivity.this.getApplicationContext(), R.string.read_start, Toast.LENGTH_SHORT).show();
}
}
});
Conclusion
Now you know how to implement the dictation word broadcast function in just a few minutes.