INDIGENOUS LANGUAGES RECORDED SPEECH, STUDY AND IMPLEMENTATION SERVICES
Solicitation number 18-22008
Publication date
Closing date and time 2018/06/18 14:00 EDT
Description
Advance Contract Award Notice - Solicitation Number: 18-22008
INDIGENOUS LANGUAGES RECORDED SPEECH, STUDY AND IMPLEMENTATION SERVICES
THE NATIONAL RESEARCH COUNCIL OF CANADA’S DIGITAL TECHNOLOGIES RESEARCH CENTRE REQUIRES THE SERVICES OF AN ORGANIZATION TO CREATE SOFTWARE TOOLS AND WEB SERVICES THAT WILL CARRY OUT SEGMENTATION AND AUDIO INDEXATION ON RECORDED SPEECH IN CANADIAN INDIGENOUS LANGUAGES.
1.0 ACAN - An ACAN is a public notice indicating to the supplier community that a department or agency intends to award a contract for goods, services or construction to a pre-identified supplier, thereby allowing other suppliers to signal their interest in bidding, by submitting a statement of capabilities. If no supplier submits a statement of capabilities that meets the requirements set out in the ACAN, on or before the closing date stated in the ACAN, the contracting officer may then proceed with the award to the pre-identified supplier.
2.0 Background: NRC’s Digital Technologies Research Centre focuses on advanced software for processing human language, pattern recognition, augmented intelligence and machine learning.
2.1 Mission: The Indigenous Languages Technology program within the Digital Technologies Research Centre has a mandate to develop speech- and text‑based technologies for assisting the revitalization and preservation of Indigenous languages by supporting Indigenous language educators and students, promoting the accessibility of audio recordings, and supporting Indigenous language translators, transcribers and other language professionals. The project covered by the contract relates to promoting the accessibility of audio recordings containing speech in Indigenous languages.
3.0 SCOPE OF WORK: The contractor will be responsible for developing, training and experimenting with the speech algorithms and models for the project, and for hosting web service instances for prototyping. The contractor will also fund initiatives directly related to the project objectives, using a part of the budget earmarked for partners. Most of the deliverables fall into two categories: those related to the segmentation sub-project, and those related to the audio indexation sub-project. A third type of deliverable is related to transcription tools.
Segmentation is the process by which the boundaries of utterances in a given recording are identified, marked off, and attributed to particular speakers. It can be subdivided into speaker segmentation, which identifies segments spoken by a particular person, and language segmentation, which identifies segments in a particular language (so that, e.g., Cree speech segments in an audio file can be separated from English and French speech segments). Segmentation is a necessary step in current best practice in language documentation and has been identified as an important bottleneck for language study and preservation.
Audio indexation uses speech recognition to make audiovisual recordings searchable by their spoken content using text queries. It will enable people in a given Indigenous language community to easily access and navigate audiovisual documents in their language. In contrast with other speech recognition applications, it does not require accurate, word-by-word transcription, and is easier to achieve for languages which don’t have thousands of hours of already transcribed recordings.
Transcription tools that will make human transcribers of speech in Indigenous languages more productive will also be developed in the project.
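As an illustration only, the output of segmentation and a text-queryable index over it could be represented as in the following minimal Python sketch. Nothing here comes from the notice itself: the `Segment` fields, the language codes, the speaker labels and the placeholder word tokens are all assumptions made for the example, and the index is a plain word-to-segment lookup rather than any specific audio indexation method.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    """One utterance in a recording (hypothetical structure for illustration)."""
    start: float        # seconds from the beginning of the recording
    end: float          # seconds from the beginning of the recording
    speaker: str        # speaker label assigned by speaker segmentation
    language: str       # language label assigned by language segmentation
    words: tuple = ()   # approximate recognizer output; need not be exact


def select(segments, *, speaker=None, language=None):
    """Return the segments matching the given speaker and/or language."""
    return [
        s for s in segments
        if (speaker is None or s.speaker == speaker)
        and (language is None or s.language == language)
    ]


def build_index(segments):
    """Map each lower-cased word to the segments in which it was recognized."""
    index = defaultdict(list)
    for seg in segments:
        for word in set(seg.words):
            index[word.lower()].append(seg)
    return index


def search(index, query):
    """Return sorted (start, end) times of segments containing the query word."""
    return sorted((s.start, s.end) for s in index.get(query.lower(), []))


# Placeholder data: labels and tokens are invented, not real transcriptions.
segments = [
    Segment(0.0, 4.2, "spk1", "eng", ("token_a", "token_b")),
    Segment(4.2, 9.8, "spk2", "cre", ("token_c",)),
    Segment(9.8, 12.5, "spk1", "fra", ("token_d",)),
    Segment(12.5, 20.0, "spk2", "cre", ("token_c", "token_e")),
]

cree_segments = select(segments, language="cre")   # separate Cree speech
index = build_index(segments)
hits = search(index, "token_c")                    # time spans to jump to
```

Because the search only needs to land the listener in the right segment, a recognizer that gets some words wrong can still be useful, which is consistent with the point above that indexation does not require word-by-word accuracy.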
3.1 S1. Speaker segmentation sub-project
The Contractor must:
- Establish baselines with published algorithms and toolkits, evaluated on in-domain data. Adapt/train, reevaluate on in-domain data.
- Develop segmentation algorithms that work left to right, before the entire recording is completed. Compare with conventional algorithm performance.
- Extend speaker verification methods to exploit information provided by user to label all segments uttered by a given speaker.
- Adapt methods for interactive use, such as query-by-example speaker
S2. Language segmentation sub-project
The Contractor must:
- Establish baselines with published algorithms and toolkits, evaluated on in-domain data. Adapt/train, reevaluate on in-domain data.
- Extend language identification methods to segmentation of recordings that contain many languages. Develop language segmentation algorithms that work left to right without a complete recording.
- Combine speaker and language segmentation (requires identifying same speaker in different languages).
- Evaluate usefulness and transcription acceleration over existing methods.
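The "left to right" requirement in S1 and S2 can be pictured with the following minimal Python sketch. It assumes that some upstream classifier (not shown, and not specified in this notice) already produces one speaker or language label per audio frame; the function name and the `(time, label)` frame format are hypothetical. The sketch only shows how segments can be emitted as soon as a label change is observed, without access to the rest of the recording.

```python
def online_segments(frames):
    """Yield (start, end, label) segments from a stream of (time, label) frames.

    Frames are consumed strictly left to right. A segment is emitted as soon
    as the label changes, so earlier segments are available before the
    recording ends. `time` is assumed to be the start time of each frame.
    """
    start = None
    label = None
    last_time = None
    for time, frame_label in frames:
        if start is None:
            # First frame opens the first segment.
            start, label = time, frame_label
        elif frame_label != label:
            # Label change: close the current segment and open a new one.
            yield (start, time, label)
            start, label = time, frame_label
        last_time = time
    if start is not None:
        # End of stream: close the last open segment at the last frame seen.
        yield (start, last_time, label)


# Placeholder frame labels, as a hypothetical classifier might emit them.
frames = [(0.0, "spk1"), (0.5, "spk1"), (1.0, "spk2"), (1.5, "spk2"), (2.0, "spk1")]

stream = online_segments(iter(frames))
first = next(stream)          # available after only three frames were read
remaining = list(stream)
```

A conventional offline algorithm could instead revisit earlier boundaries once the whole recording is available, which is the comparison the requirements above call for.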
A. Audio indexation sub-project
The Contractor must:
- Develop query-by-example word capabilities, based on audio or underlying phonetic representation.
- Solve problems specific to Indigenous languages for training pronunciation dictionary and training language model from text. Train acoustic models from large amounts of transcribed recordings in one language. Work on audio and text data validation, normalization, cleaning, etc.
- Develop multilingual training methods for pooling small amounts of transcribed recordings, semi- or unsupervised training.
- Assemble speech recognition components to provide a service for keyword indexation and searching.
T. Transcription tool sub-project
The Contractor must:
- Develop software for a computer-assisted transcription tool, both for the desktop and as a web service (with integration into existing transcription software).
3.2 Credentials of the contracted organization:
The organization chosen as the contractor must:
- Have solid research credentials in the field of speaker recognition (for instance, a history of success in international evaluations of speaker recognition);
- Have solid research credentials in the field of speech recognition (for instance, a history of success in international evaluations of speech recognition);
- Have a demonstrated record of transferring speech technology – especially audio indexation and speech recognition systems – to other organizations, both in the private and government sectors;
- Have a demonstrated record of providing publicly available web services based on speech technology;
- Have the ability to communicate with NRC in both official languages.
4.0 Deliverables and expectations:
(Deliverables are listed by quarter – e.g., “Q1” would indicate a deliverable that will be delivered at the end of the first 3 months of the project)
- Two meetings/workshops involving all participants in the project: Q4 and Q8.
- Computer-assisted transcription tools version 1 (online speaker segmentation & online language segmentation): Q4
- Keyword search for languages L1 and L2: Q5
- Computer-assisted transcription tools version 2 (active speaker labeling & joint speaker and language segmentation): Q6.
- Low-resource speech recognition components for L3 and L4: Q6
- Keyword search for languages L3 and L4: Q7
5.0 Justification for the Pre-Identified Supplier: The National Research Council wishes to award a sole source contract to the Computer Research Institute of Montreal (CRIM). CRIM was chosen for this agreement as they are the only known vendor with the unique skill set for developing, training, and perfecting through experimentation the speech algorithms and models required to segment and index speech files in certain Canadian Indigenous languages.
6.0 This procurement is subject to the following trade agreement(s):
Agreement on Internal Trade (AIT)
North American Free Trade Agreement (NAFTA)
World Trade Organization Agreement on Government Procurement (WTO GPA)
7.0 Government Contracts Regulations Exception(s): Only one vendor is capable of performing the work. Exclusions and/or Limited Tendering Reasons.
Where, for works of art, or for reasons connected with the protection of patents, copyrights or other exclusive rights, or proprietary information or where there is an absence of competition for technical reasons, the goods or services can be supplied only by a particular supplier and no reasonable alternative or substitute exists.
8.0 Ownership of Intellectual Property: Ownership of any Foreground Intellectual Property arising out of the proposed contract will vest in the Contractor.
9.0 Supplier’s Right to Submit a Statement of Capabilities: Suppliers who consider themselves fully qualified and available to provide the services described in the ACAN may submit a statement of capabilities in writing to the contact person identified in this notice on or before the closing date of this notice. The statement of capabilities must clearly demonstrate how the supplier meets the advertised requirements.
10.0 Inquiries and statement of capabilities are to be directed to:
National Research Council Canada - Katie Homuth - 613-998-7763
Email: Katie.Homuth@nrc-cnrc.gc.ca
Solicitation closing date: June 18, 2018 at 2:00 p.m. EDT
11.0 Estimated Contract value: $900,000 to $1,200,000 CDN
This contract is for a period of 22 months, from June 1, 2018 to March 31, 2020.
12.0 Vendor: Computer Research Institute of Montreal (CRIM), 405, Ogilvy Avenue, Suite 101, Montréal (Québec) H3N 1M3
Contract duration
Refer to the description above for full details.
Trade agreements
World Trade Organization Agreement on Government Procurement (WTO GPA)
Contact information
Contracting organization
Organization: National Research Council Canada
Address: 100 Sussex Dr, Ottawa, Ontario, K1A 0R6, Canada
Contracting authority: Homuth, Katie
Phone: 613-998-7763
Email: Katie.Homuth@nrc-cnrc.gc.ca
Address: 1200 Montreal Road, Ottawa, ON, K1A 0R6, CA
Buying organization(s)
Organization: National Research Council Canada
Address: 100 Sussex Dr, Ottawa, Ontario, K1A 0R6, Canada