Google Developers Blog


14th September 2017 |

Actions on Google is now available in Australia
Posted by Brad Abrams, Product Manager

Last month we announced that UK users can access apps for the Google Assistant on Google Home and their phones. Starting today, we're bringing Actions on Google to Australia. From Perth to Sydney, developers can start building apps for the Google Assistant, giving their users even more ways to get things done.

Similar to our launch in the UK, your English apps will appear in the local directory automatically. With that said, there are a few things to help make your app a true blue Aussie:

  • New TTS voices: There are a number of new TTS voices with an Australian English accent. We've automatically selected one for your app, but you can change the selected voice or opt to use your current US or UK English voice in the Actions console.
  • Practice makes perfect: We also recommend reviewing your response text strings and adjusting them for differences between the dialects. For example, candy should be "lollies" and a "servo" is a gas station.

Our developer tools, documentation and simulator have all been updated to make it easy for you to create, test and deploy your app. So what are you waiting for?

UK and Aussie users are just the start; we'll continue to make the Actions on Google platform available in more languages over the coming year. If you have questions about internationalization, please reach out to us on Stack Overflow and Google+.


12th September 2017 |

Introduction to TensorFlow Datasets and Estimators
Posted by The TensorFlow Team

TensorFlow 1.3 introduces two important features that you should try out:

  • Datasets: A completely new way of creating input pipelines (that is, reading data into your program).
  • Estimators: A high-level way to create TensorFlow models. Estimators include pre-made models for common machine learning tasks, but you can also use them to create your own custom models.

Below you can see how they fit in the TensorFlow architecture. Combined, they offer an easy way to create TensorFlow models and to feed data to them:

Our Example Model

To explore these features we're going to build a model and show you relevant code snippets. The complete code is available here, including instructions for getting the training and test files. Note that the code was written to demonstrate how Datasets and Estimators work functionally, and was not optimized for maximum performance.

The trained model categorizes Iris flowers based on four botanical features (sepal length, sepal width, petal length, and petal width). So, during inference, you can provide values for those four features and the model will predict that the flower is one of the following three beautiful variants:

From left to right: Iris setosa (by Radomil, CC BY-SA 3.0), Iris versicolor (by Dlanglois, CC BY-SA 3.0), and Iris virginica (by Frank Mayfield, CC BY-SA 2.0).

We're going to train a Deep Neural Network Classifier with the below structure. All input and output values will be float32, and the sum of the output values will be 1 (as we are predicting the probability for each individual Iris type):

For example, an output result might be 0.05 for Iris Setosa, 0.9 for Iris Versicolor, and 0.05 for Iris Virginica, which indicates a 90% probability that this is an Iris Versicolor.
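The outputs sum to 1 because a classifier of this kind typically normalizes its final-layer scores with a softmax. Here is a minimal pure-Python sketch of that normalization; the raw scores are made-up illustrative values, not output from the model in this post:

```python
import math

def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    highest = max(scores)
    # Subtracting the maximum keeps exp() numerically stable.
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical raw scores for Setosa, Versicolor, Virginica.
raw_scores = [0.1, 3.0, 0.1]
probabilities = softmax(raw_scores)

# The predicted class is the index with the highest probability.
predicted_class = probabilities.index(max(probabilities))

print(probabilities)
print(predicted_class)
```

The index of the largest probability is what the model reports as the predicted class.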

Alright! Now that we have defined the model, let's look at how we can use Datasets and Estimators to train it and make predictions.

Introducing The Datasets

Datasets is a new way to create input pipelines to TensorFlow models. This API is much more performant than using feed_dict or the queue-based pipelines, and it's cleaner and easier to use. Although Datasets still resides in tf.contrib.data at 1.3, we expect to move this API to core at 1.4, so it's high time to take it for a test drive.

At a high level, the Datasets API consists of the following classes:


  • Dataset: Base class containing methods to create and transform datasets. Also allows you to initialize a dataset from data in memory, or from a Python generator.
  • TextLineDataset: Reads lines from text files.
  • TFRecordDataset: Reads records from TFRecord files.
  • FixedLengthRecordDataset: Reads fixed size records from binary files.
  • Iterator: Provides a way to access one dataset element at a time.

Our dataset

To get started, let's first look at the dataset we will use to feed our model. We'll read data from a CSV file in which each row contains five values (the four input values, plus the label):

The label will be:

  • 0 for Iris Setosa
  • 1 for Versicolor
  • 2 for Virginica.

Representing our dataset

To describe our dataset, we first create a list of our features:

feature_names = [
    'SepalLength',
    'SepalWidth',
    'PetalLength',
    'PetalWidth']

When we train our model, we'll need a function that reads the input file and returns the feature and label data. Estimators requires that you create a function of the following format:

def input_fn():
    # ...code that reads the data...
    return ({ 'SepalLength':[values], ..<etc>.., 'PetalWidth':[values] },
            [labels])

The return value must be a two-element tuple organized as follows:

  • The first element must be a dict in which each input feature is a key, and then a list of values for the training batch.
  • The second element is a list of labels for the training batch.

Since we are returning a batch of input features and training labels, all lists in the return statement will have equal lengths. Technically speaking, whenever we refer to a "list" here, we actually mean a 1-D TensorFlow tensor.
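To make the expected return shape concrete, here is a small sketch that uses plain Python lists in place of tensors. The function name toy_input_fn is hypothetical, the two middle feature names are assumed from the feature description earlier in this post, and the three rows are the in-memory examples used later in this post:

```python
# 'SepalLength' and 'PetalWidth' appear in the post; 'SepalWidth' and
# 'PetalLength' are assumed from the list of botanical features.
feature_names = ['SepalLength', 'SepalWidth', 'PetalLength', 'PetalWidth']

# Three example rows and their labels
# (0 = Setosa, 1 = Versicolor, 2 = Virginica).
rows = [[5.9, 3.0, 4.2, 1.5],
        [6.9, 3.1, 5.4, 2.1],
        [5.1, 3.3, 1.7, 0.5]]
labels = [1, 2, 0]

def toy_input_fn():
    """Return (features, labels) in the layout an Estimator expects.

    Plain lists stand in for the 1-D tensors a real input_fn returns.
    """
    columns = list(zip(*rows))  # Transpose: one column per feature
    features = {name: list(col) for name, col in zip(feature_names, columns)}
    return features, labels

features, batch_labels = toy_input_fn()
print(features['SepalLength'])  # One value per example in the batch
print(batch_labels)
```

Every list in the dict has the same length as the label list, which is the batch size.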

To allow simple reuse of the input_fn we're going to add some arguments to it. This allows us to build input functions with different settings. The arguments are pretty straightforward:

  • file_path: The data file to read.
  • perform_shuffle: Whether the record order should be randomized.
  • repeat_count: The number of times to iterate over the records in the dataset. For example, if we specify 1, then each record is read once. If we specify None, iteration will continue forever.

Here's how we can implement this function using the Dataset API. We will wrap this in an "input function" that is suitable when feeding our Estimator model later on:

import tensorflow as tf

def my_input_fn(file_path, perform_shuffle=False, repeat_count=1):
    def decode_csv(line):
        # Parse one CSV line: four float features followed by an int label
        parsed_line = tf.decode_csv(line, [[0.], [0.], [0.], [0.], [0]])
        label = parsed_line[-1:]  # Last element is the label
        del parsed_line[-1]  # Delete last element
        features = parsed_line  # Everything (but last element) are the features
        d = dict(zip(feature_names, features)), label
        return d

    dataset = (tf.contrib.data.TextLineDataset(file_path)  # Read text file
               .skip(1)  # Skip header row
               .map(decode_csv))  # Transform each elem by applying decode_csv fn
    if perform_shuffle:
        # Randomizes input using a window of 256 elements (read into memory)
        dataset = dataset.shuffle(buffer_size=256)
    dataset = dataset.repeat(repeat_count)  # Repeats dataset this # times
    dataset = dataset.batch(32)  # Batch size to use
    iterator = dataset.make_one_shot_iterator()
    batch_features, batch_labels = iterator.get_next()
    return batch_features, batch_labels

Note the following:

  • TextLineDataset: The Dataset API will do a lot of memory management for you when you're using its file-based datasets. You can, for example, read in dataset files much larger than memory or read in multiple files by specifying a list as argument.
  • shuffle: Reads buffer_size records, then shuffles (randomizes) their order.
  • map: Calls the decode_csv function with each element in the dataset as an argument. Since we are using TextLineDataset, each element is a line of CSV text.
  • decode_csv: Splits each line into fields, providing the default values if necessary. It then returns a tuple containing a dict of the field keys and field values, plus the label. The map function replaces each element (line) in the dataset with that tuple.
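For intuition only, here is a simplified pure-Python analogue of the skip, map, shuffle, repeat and batch steps. It does not use TensorFlow and does not reproduce how the Dataset API is implemented. The names toy_pipeline, shuffle and batch and the header line are hypothetical, and batches are lists of records rather than a dict of columns:

```python
import random

feature_names = ['SepalLength', 'SepalWidth', 'PetalLength', 'PetalWidth']

def decode_csv(line):
    """Split one CSV line into a (features dict, label) pair."""
    fields = line.split(',')
    features = dict(zip(feature_names, (float(v) for v in fields[:4])))
    return features, int(fields[4])

def shuffle(records, buffer_size, rng):
    """Yield records in random order using a fixed-size buffer."""
    buffer = []
    for record in records:
        buffer.append(record)
        if len(buffer) >= buffer_size:
            yield buffer.pop(rng.randrange(len(buffer)))
    while buffer:  # Drain what is left once the input is exhausted
        yield buffer.pop(rng.randrange(len(buffer)))

def batch(records, batch_size):
    """Group records into lists of at most batch_size elements."""
    current = []
    for record in records:
        current.append(record)
        if len(current) == batch_size:
            yield current
            current = []
    if current:
        yield current

def toy_pipeline(lines, perform_shuffle=False, repeat_count=1,
                 batch_size=2, seed=0):
    rng = random.Random(seed)
    records = [decode_csv(line) for line in lines[1:]]  # skip(1), then map

    def repeated():
        for _ in range(repeat_count):  # repeat(repeat_count)
            if perform_shuffle:
                for record in shuffle(records, buffer_size=256, rng=rng):
                    yield record
            else:
                for record in records:
                    yield record

    return batch(repeated(), batch_size)  # batch(batch_size)

# Hypothetical header line followed by three data rows.
lines = ['SepalLength,SepalWidth,PetalLength,PetalWidth,Species',
         '5.9,3.0,4.2,1.5,1',
         '6.9,3.1,5.4,2.1,2',
         '5.1,3.3,1.7,0.5,0']

batches = list(toy_pipeline(lines, repeat_count=2))
shuffled = list(toy_pipeline(lines, perform_shuffle=True, batch_size=3))
print(len(batches))
```

Because shuffling happens before repeating, each pass over the data is shuffled separately.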

That's an introduction to Datasets! Just for fun, we can now use this function to print the first batch:

next_batch = my_input_fn(FILE, True)  # Will return 32 random elements

# Now let's try it out, retrieving and printing one batch of data.
# Although this code looks strange, you don't need to understand
# the details.
with tf.Session() as sess:
    first_batch = sess.run(next_batch)
print(first_batch)

# Output
({'SepalLength': array([ 5.4000001, ...<repeat to 32 elems>], dtype=float32),
  'PetalWidth': array([ 0.40000001, ...<repeat to 32 elems>], dtype=float32),
  ...
 },
 [array([[2], ...<repeat to 32 elems>], dtype=int32)]  # Labels
)

That's actually all we need from the Dataset API to implement our model. Datasets have a lot more capabilities though; please see the end of this post where we have collected more resources.

Introducing Estimators

Estimators is a high-level API that reduces much of the boilerplate code you previously needed to write when training a TensorFlow model. Estimators are also very flexible, allowing you to override the default behavior if you have specific requirements for your model.

There are two possible ways you can build your model using Estimators:

  • Pre-made Estimator - These are predefined estimators, created to generate a specific type of model. In this blog post, we will use the DNNClassifier pre-made estimator.
  • Estimator (base class) - Gives you complete control of how your model should be created by using a model_fn function. We will cover how to do this in a separate blog post.

Here is the class diagram for Estimators:

We hope to add more pre-made Estimators in future releases.

As you can see, all estimators make use of input_fn that provides the estimator with input data. In our case, we will reuse my_input_fn, which we defined for this purpose.

The following code instantiates the estimator that predicts the Iris flower type:

# Create the feature_columns, which specifies the input to our model.
# All our input features are numeric, so use numeric_column for each one.
feature_columns = [tf.feature_column.numeric_column(k) for k in feature_names]

# Create a deep neural network classifier.
# Use the DNNClassifier pre-made estimator
classifier = tf.estimator.DNNClassifier(
    feature_columns=feature_columns,  # The input features to our model
    hidden_units=[10, 10],  # Two layers, each with 10 neurons
    n_classes=3,  # Three Iris types (the default is 2)
    model_dir=PATH)  # Path to where checkpoints etc are stored

We now have an estimator that we can start to train.

Training the model

Training is performed using a single line of TensorFlow code:

# Train our model, using the previously defined function my_input_fn
# Input to training is a file with training examples
# Stop training after 8 iterations of train data (epochs)
classifier.train(
    input_fn=lambda: my_input_fn(FILE_TRAIN, True, 8))

But wait a minute... what is this "lambda: my_input_fn(FILE_TRAIN, True, 8)" stuff? That is where we hook up Datasets with the Estimators! Estimators need data to perform training, evaluation, and prediction, and they use the input_fn to fetch it. Estimators require an input_fn with no arguments, so we use lambda to create a function with no arguments that calls our input_fn with the desired arguments: the file_path, shuffle setting, and repeat_count. In our case, we use our my_input_fn, passing it:

  • FILE_TRAIN, which is the training data file.
  • True, which tells the Estimator to shuffle the data.
  • 8, which tells the Estimator to repeat the dataset 8 times.
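The zero-argument wrapping can be tried in isolation. In this sketch, my_input_fn is a stand-in that only reports its arguments, call_without_arguments is a hypothetical helper that mimics how a caller invokes input_fn(), and the file path is made up. functools.partial is shown as an equivalent alternative to lambda:

```python
from functools import partial

def my_input_fn(file_path, perform_shuffle=False, repeat_count=1):
    """Stand-in for the real input function; just reports its arguments."""
    return (file_path, perform_shuffle, repeat_count)

def call_without_arguments(input_fn):
    """Mimics a caller that invokes input_fn() with no arguments."""
    return input_fn()

FILE_TRAIN = 'train.csv'  # Hypothetical path

# Both wrappers bind the arguments now and defer the call until later.
via_lambda = call_without_arguments(lambda: my_input_fn(FILE_TRAIN, True, 8))
via_partial = call_without_arguments(partial(my_input_fn, FILE_TRAIN, True, 8))

print(via_lambda)
print(via_partial)
```

Either form gives the caller a callable it can invoke with no arguments.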

Evaluating Our Trained Model

Ok, so now we have a trained model. How can we evaluate how well it's performing? Fortunately, every Estimator contains an evaluate method:

# Evaluate our model using the examples contained in FILE_TEST
# Return value will contain evaluation_metrics such as: loss & average_loss
evaluate_result = classifier.evaluate(
    input_fn=lambda: my_input_fn(FILE_TEST, False, 4))
print("Evaluation results")
for key in evaluate_result:
    print("   {}, was: {}".format(key, evaluate_result[key]))

In our case, we reach an accuracy of about 93%. There are various ways of improving this accuracy, of course. One way is to simply run the program over and over. Since the state of the model is persisted (in model_dir=PATH above), the model will improve the more iterations you train it, until it settles. Another way would be to adjust the number of hidden layers or the number of nodes in each hidden layer. Feel free to experiment with this; please note, however, that when you make a change, you need to remove the directory specified in model_dir=PATH, since you are changing the structure of the DNNClassifier.
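One way to clear the checkpoint directory before retraining with a different network structure is sketched below. The helper name reset_model_dir is hypothetical, and the demonstration runs against a throwaway temporary directory rather than a real model_dir:

```python
import os
import shutil
import tempfile

def reset_model_dir(path):
    """Delete a checkpoint directory if it exists, so training starts fresh."""
    if os.path.isdir(path):
        shutil.rmtree(path)

# Demonstration on a temporary directory standing in for PATH.
demo_dir = tempfile.mkdtemp()
with open(os.path.join(demo_dir, 'checkpoint'), 'w') as handle:
    handle.write('placeholder')

reset_model_dir(demo_dir)
removed = not os.path.exists(demo_dir)
print(removed)
```

Calling the helper on a directory that does not exist is a no-op, so it is safe to run before the first training session as well.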

Making Predictions Using Our Trained Model

And that's it! We now have a trained model, and if we are happy with the evaluation results, we can use it to predict an Iris flower based on some input. As with training and evaluation, we make predictions using a single function call:

# Predict the type of some Iris flowers.
# Let's predict the examples in FILE_TEST, repeat only once.
predict_results = classifier.predict(
    input_fn=lambda: my_input_fn(FILE_TEST, False, 1))
print("Predictions on test file")
for prediction in predict_results:
    # Will print the predicted class, i.e: 0, 1, or 2 if the prediction
    # is Iris Setosa, Versicolor, Virginica, respectively.
    print(prediction["class_ids"][0])

Making Predictions on Data in Memory

The preceding code specified FILE_TEST to make predictions on data stored in a file, but how could we make predictions on data residing in other sources, for example, in memory? As you may guess, this does not actually require a change to our predict call. Instead, we configure the Dataset API to use a memory structure as follows:

# Let's create a memory dataset for prediction.
# We've taken the first 3 examples in FILE_TEST.
prediction_input = [[5.9, 3.0, 4.2, 1.5],  # -> 1, Iris Versicolor
                    [6.9, 3.1, 5.4, 2.1],  # -> 2, Iris Virginica
                    [5.1, 3.3, 1.7, 0.5]]  # -> 0, Iris Setosa
def new_input_fn():
    def decode(x):
        x = tf.split(x, 4)  # Need to split into our 4 features
        # When predicting, we don't need (or have) any labels
        return dict(zip(feature_names, x))  # Then build a dict from them

    # The from_tensor_slices function will use a memory structure as input
    dataset = tf.contrib.data.Dataset.from_tensor_slices(prediction_input)
    dataset = dataset.map(decode)
    iterator = dataset.make_one_shot_iterator()
    next_feature_batch = iterator.get_next()
    return next_feature_batch, None  # In prediction, we have no labels

# Predict all our prediction_input
predict_results = classifier.predict(input_fn=new_input_fn)

# Print results
print("Predictions on memory data")
for idx, prediction in enumerate(predict_results):
    class_id = prediction["class_ids"][0]  # Get the predicted class (index)
    if class_id == 0:
        print("I think: {}, is Iris Setosa".format(prediction_input[idx]))
    elif class_id == 1:
        print("I think: {}, is Iris Versicolor".format(prediction_input[idx]))
    else:
        print("I think: {}, is Iris Virginica".format(prediction_input[idx]))

Dataset.from_tensor_slices() is designed for small datasets that fit in memory. When using TextLineDataset as we did for training and evaluation, you can have arbitrarily large files, as long as your memory can manage the shuffle buffer and batch sizes.
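As a side note on the printing loop above, the branch per class can be replaced with a lookup by class index. This is only an illustrative alternative; the names IRIS_NAMES and describe are hypothetical, and the order follows the label encoding given earlier (0, 1, 2):

```python
# Index in this list matches the label: 0, 1, 2.
IRIS_NAMES = ['Iris Setosa', 'Iris Versicolor', 'Iris Virginica']

def describe(features, class_id):
    """Build the same message as the loop above, using a lookup."""
    return "I think: {}, is {}".format(features, IRIS_NAMES[class_id])

message = describe([5.9, 3.0, 4.2, 1.5], 1)
print(message)
```

A lookup keeps the message format in one place if more classes are added later.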


Using a pre-made Estimator like DNNClassifier provides a lot of value. In addition to being easy to use, pre-made Estimators also provide built-in evaluation metrics, and create summaries you can see in TensorBoard. To see this reporting, start TensorBoard from your command-line as follows:

# Replace PATH with the actual path passed as model_dir argument when the
# DNNClassifier estimator was created.
tensorboard --logdir=PATH

The following diagrams show some of the data that TensorBoard will provide:


In this blog post, we explored Datasets and Estimators. These are important APIs for defining input data streams and creating models, so investing time to learn them is definitely worthwhile!

For more details, be sure to check out

But it doesn't stop here. We will shortly publish more posts that describe how these APIs work, so stay tuned for that!

Until then, Happy TensorFlow coding!


6th September 2017 |

Making the Google Developers documentation style guide public
Posted by Jed Hartman, Technical Writer

You can now use our developer-documentation style guide for open source documentation projects.

For some years now, our technical writers at Google have used an internal-only editorial style guide for most of our developer documentation. In order to better support external contributors to our open source projects, such as Kubernetes, AMP, or Dart, and to allow for more consistency across developer documentation, we're now making that style guide public.

If you contribute documentation to projects like those, you now have direct access to useful guidance about voice, tone, word choice, and other style considerations. It can be useful for general issues, like reminders to use second person, present tense, active voice, and the serial comma; it can also be great for checking very specific issues, like whether to write "app" or "application" when you want to be consistent with the Google Developers style.

The style guide is a reference document, so instead of reading through it in linear order, you can use it to look things up as needed. For matters of punctuation, grammar, and formatting, you can do a search-in-page to find items like "Commas," "Lists," and "Link text" in the left nav. For specific terms and phrases, you can look at the word list.

Keep an eye on the guide's release notes page for updates and developments, and send us your comments and suggestions via the Send Feedback link on each page of the guide. We want to hear from you as we continue to evolve the style guide.


5th September 2017 |

Introducing the Mobile Web Specialist Certification by Google Developers
Posted by Sarah Clark, Program Manager, Web Developer Training
If you're a web developer, it's a crowded market, and you likely want to set yourself apart from other web developers. Would you like to show that you have the skills to create responsive and flexible web applications?
The Google Developers Certification Team is pleased to announce the Mobile Web Specialist Certification. Based on a thorough analysis of the market, this new certification highlights developers who have in-demand skills as mobile web developers. (But don't worry, the skills demonstrated in this exam can be used on the desktop and across all browsers.)
Use our Mobile Web Specialist Study Guide to help you prepare. When you're ready to take the exam, you will write code in a timed, performance-based exam. The cost for certification is $99 USD (6500 INR if you reside in India) and includes up to three exam attempts.
Check out this short video for a quick overview of the Mobile Web Specialist certification process:
Earning your Mobile Web Specialist Certification gives you a digital badge to display on your resume and social media profiles. As a member of the Mobile Web Specialist Alumni Community, you will also have access to program benefits focused on increasing your visibility as a certified developer.
The Mobile Web Specialist Certification joins the Associate Android Developer Certification in Google's family of performance-based certifications.
Visit to get started and earn your Google Mobile Web Specialist Certification.


1st September 2017 |

Bringing Real-time Spatial Audio to the Web with Songbird
Posted by Jamieson Brettle and Drew Allen, Chrome Media Team

For a virtual scene to be truly immersive, stunning visuals need to be accompanied by true spatial audio to create a realistic and believable experience. Spatial audio tools allow developers to include sounds that can come from any direction, and that are associated in 3D space with audio sources, thus completely enveloping the user in 360-degree sound.

Spatial audio helps draw the user into a scene and creates the illusion of entering an entirely new world. To make this possible, the Chrome Media team has created Songbird, an open source, spatial audio encoding engine that works in any web browser by using the Web Audio API.

The Songbird library takes in any number of mono audio streams and allows developers to programmatically place them in 3D space around the user. Songbird allows you to create immersive soundscapes, realistically reproducing reflection and reverb for the space you describe. Sounds bounce off walls and reflect off materials just as they would in real life, capturing truly 360-degree sound. Songbird creates an ambisonic soundfield that can then be rendered in real time for use in your application. We've partnered with the Omnitone project, which we blogged about last year, to add higher-order ambisonic support to Omnitone's binaural renderer to produce far more accurate sounding audio than ever before.

Songbird encapsulates Omnitone and with it, developers can now add interactive, full-sphere audio to any web based application. Songbird can scale to any order ambisonics, thereby creating a more realistic sound and higher performance than what is achievable through standard Web Audio API.

Songbird Audio Processing Diagram

The implementation of Songbird is based on the Google spatial media specification. It expects mono input and outputs ambisonic (multichannel) ACN channel layout with SN3D normalization. Detailed documentation may be found here.
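For readers new to ambisonics, the sketch below shows how channel counts and ACN channel numbers relate to the ambisonic order. It follows the standard ACN convention (channel number = l(l + 1) + m for degree l and index m) and is not taken from Songbird's source; the function names are hypothetical:

```python
import math

def channel_count(order):
    """Number of channels in a full-sphere soundfield of the given order."""
    return (order + 1) ** 2

def acn_index(degree, index):
    """ACN channel number for spherical-harmonic degree l and index m."""
    if abs(index) > degree:
        raise ValueError("index must satisfy -degree <= index <= degree")
    return degree * (degree + 1) + index

def acn_to_degree_index(acn):
    """Invert acn_index: recover (degree, index) from a channel number."""
    degree = math.isqrt(acn)
    return degree, acn - degree * (degree + 1)

first_order_channels = channel_count(1)
third_order_channels = channel_count(3)
print(first_order_channels, third_order_channels)
print([acn_to_degree_index(n) for n in range(first_order_channels)])
```

The channel count grows quadratically with the order, which is why higher-order ambisonics costs more to encode and render.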

As the web emerges as an important VR platform for delivering content, spatial audio will play a vital role in users' embrace of this new medium. Songbird and Omnitone are key tools in enabling spatial audio on the web platform and establishing it as a preeminent platform for compelling VR experiences. Combining these audio experiences with 3D JavaScript libraries like three.js gives a glimpse into the future on the web.

Demo combining spatial sound in 3D environment

This project was made possible through close collaboration with Google's Daydream and Web Audio teams. This collaboration allowed us to deliver similar audio capabilities to the web as are available to developers creating Daydream applications.

We look forward to seeing what people do with Songbird now that it's open source. Check out the code on GitHub and let us know what you think. Also available are a number of demos on creating full spherical audio with Songbird.


5th September 2017 |

Launchpad Accelerator is open to more countries around the world! Apply now.
Posted by Roy Glasberg, Global Lead, Launchpad Program & Accelerator

Launchpad Accelerator gives us an opportunity to work with and empower amazing developers, who are solving major challenges all around the world -- whether it's streamlining digital commerce across Africa, providing access to multimedia tools that support special needs education, or using AI to simplify business operations.

That's why we're doubling down on our efforts and opening up applications for the next class of the program to more countries for the first time starting today. Here's the full list of the new additions:

  • Africa: Algeria, Egypt, Ghana, Morocco, Tanzania, Tunisia & Uganda
  • Asia: Bangladesh, Myanmar, Pakistan & Sri Lanka
  • Europe: Estonia, Romania, Ukraine, Belarus & Russia
  • Latin America: Costa Rica, Panama, Peru & Uruguay

They'll be joined by our larger list of countries that are already part of the program, including: Argentina, Brazil, Chile, Colombia, Czech Republic, Hungary, India, Indonesia, Kenya, Malaysia, Mexico, Nigeria, Philippines, Poland, South Africa, Thailand, and Vietnam.

The application process for the equity-free program will end on October 2, 2017 at 9AM PST. Later in the year, the list of selected developers will be invited to the Google Developers Launchpad Space in San Francisco for 2 weeks of all-expense-paid training.

What are the benefits?

The training at Google HQ includes intensive mentoring from 20+ Google teams, and expert mentors from top technology companies and VCs in Silicon Valley. Participants receive equity-free support, credits for Google products, and PR support, and continue to work closely with Google back in their home country during the 6-month program. Hear from some alumni about their experiences here.

What do we look for when selecting startups?

Each startup that applies to the Launchpad Accelerator is considered holistically and with great care. Below are general guidelines behind our process to help you understand what we look for in our candidates.

All startups in the program must:

  • Be a technological startup.
  • Be targeting their local markets.
  • Have proven product-market fit (beyond ideation stage).
  • Be based in the countries listed above.

Additionally, we are interested in what kind of startup you are. We also consider:

  • The problem you are trying to solve. How does it create value for users? How are you addressing a real challenge for your home city, country or region?
  • Does your management team have a leadership mindset and the drive to become an influencer?
  • Will you share what you learn in Silicon Valley for the benefit of other startups in your local ecosystem?

If you're based outside of these countries, stay tuned, as we expect to add more countries to the program in the future.

We can't wait to hear from you and see how we can work together to improve your business.

Participants from Class 4


31st August 2017 |

Google Play Developer API new fields for In-app Billing information
Posted by Neto Marin, Developer Advocate

We'd like to share with you some good news about an improvement in the data available via the Google Play Developer API. Starting Monday Aug 28, the API for Purchases.products and Purchases.subscriptions will be returning a couple of new values:

  • orderId
    • To be returned via both products and subscriptions API
      • For products, this will be the order ID present in the purchase.
      • For subscriptions, this will be the order ID associated with the most recent recurring order.
  • New subscription cancelReason: 2. Subscription replaced
    • Will be returned for subscriptions which were canceled due to the user changing subscription plans (e.g. upgrading to a new subscription plan).

This additional data will be automatically returned to you in the JSON responses to your API calls. Please double check your integration to make sure this new field and value will not cause any problems for you.
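A defensive way to read the new values is sketched below. The response body is a made-up fragment, not a real API response: only the orderId and cancelReason field names and the value 2 come from this post, and the helper name summarize_subscription is hypothetical:

```python
import json

# Hypothetical response body; a real response contains more fields.
sample_response = '{"orderId": "EXAMPLE-ORDER-ID", "cancelReason": 2}'

CANCEL_REASON_REPLACED = 2  # "Subscription replaced", per the list above

def summarize_subscription(body):
    """Extract the new values, tolerating responses that lack them."""
    data = json.loads(body)
    return {
        'order_id': data.get('orderId'),  # None if the field is absent
        'replaced': data.get('cancelReason') == CANCEL_REASON_REPLACED,
    }

summary = summarize_subscription(sample_response)
legacy = summarize_subscription('{}')
print(summary)
print(legacy)
```

Reading fields with get() means the same code handles responses with and without the new values.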

To view all of the values returned by the APIs, check the Purchases.products and Purchases.subscriptions reference pages.


29th August 2017 |

AIY Projects update: new maker projects, new partners, new kits

Posted by Billy Rutledge, Director, AIY Projects

Makers are hands-on when it comes to making change. We're explorers, hackers and problem solvers who build devices, ecosystems, art (sometimes a combination of the three) on the basis of our own (often unconventional) ideas. So when my team first set out to empower makers of all types and ages with the AI technology we've honed at Google, we knew whatever we built had to be open and accessible. We steered clear of limitations that come from platform and software stack requirements, high cost and complex setup, and fixed our focus on the curiosity and inventiveness that inspire makers around the world.

When we launched our Voice Kit with help from our partner Raspberry Pi in May and sold out globally in just a few hours, we got the message loud and clear. There is a genuine demand among do-it-yourselfers for artificial intelligence that makes human-to-machine interaction more like natural human interaction.

Last week we announced the Speech Commands Dataset, a collaboration between the TensorFlow and AIY teams. The dataset has 65,000 one-second-long utterances of 30 short words by thousands of different contributors to the AIY website and allows you to build simple voice interfaces for applications. We're currently in the process of integrating the dataset with the next release of the Voice Kit, so makers can build devices that respond to simple voice commands without the press of a button or an internet connection.

Today, you can pre-order your Voice Kit, which will be available for purchase in stores and online through Micro Center.

Or you may have to resort to the hack that maker Shivasiddarth created when Voice Kit with MagPi #57 sold out in May, and then again (within 17 minutes) earlier this month.

Cool ways that makers are already using the Voice Kit

Martin Mander created a retro-inspired intercom that he calls 1986 Google Pi Intercom. He describes it as "a wall-mounted Google voice assistant using a Raspberry PI 3 and the Google AIY (Artificial Intelligence Yourself) [voice] kit." He used a mid-80s intercom that he bought on sale for £4. It cleaned up well!

Get the full story from Martin and see what Slashgear had to say about the project.

(This one's for Dr. Who fans) Tom Minnich created a Dalek-voiced assistant.

He offers a tutorial on how you can modify the Voice Kit to do something similar. Perhaps create a Drogon-voiced assistant?

Victor Van Hee used the Voice Kit to create a voice-activated internet streaming radio that can play other types of audio files as well. He provides instructions, so you can do the same.

The Voice Kit is currently available in the U.S. We'll be expanding globally by the end of this year. Stay tuned here, where we'll share the latest updates. The strong demand for the Voice Kit drives us to keep the momentum going on AIY Projects.

Inspiring makers with kits that understand human speech, vision and movement

What we build next will include vision and motion detection and will go hand in hand with our existing Voice Kit. AIY Project kits will soon offer makers the "eyes," "ears," "voice" and sense of "balance" to allow simple yet powerful device interfaces.

We'd love to bake your input into our next releases. Go to or leave a comment to start up a conversation with us. Show us and the maker community what you're working on by using hashtag #AIYprojects on social media.


28th August 2017 |

Kaldi now offers TensorFlow integration
Posted by Raziel Alvarez, Staff Research Engineer at Google and Yishay Carmiel, Founder of IntelligentWire

Automatic speech recognition (ASR) has seen widespread adoption due to the recent proliferation of virtual personal assistants and advances in word recognition accuracy from the application of deep learning algorithms. Many speech recognition teams rely on Kaldi, a popular open-source speech recognition toolkit. We're announcing today that Kaldi now offers TensorFlow integration.

With this integration, speech recognition researchers and developers using Kaldi will be able to use TensorFlow to explore and deploy deep learning models in their Kaldi speech recognition pipelines. This will allow the Kaldi community to build even better and more powerful ASR systems, and it will give TensorFlow users a path to explore ASR while drawing upon the experience of the large community of Kaldi developers.

Building an ASR system that can understand human speech in every language, accent, environment, and type of conversation is an extremely complex undertaking. A traditional ASR system can be seen as a processing pipeline with many separate modules, where each module operates on the output from the previous one. Raw audio data enters the pipeline at one end and a transcription of recognized speech emerges from the other. In the case of Kaldi, these ASR transcriptions are post processed in a variety of ways to support an increasing array of end-user applications.

Yishay Carmiel and Hainan Xu of Seattle-based IntelligentWire, who led the development of the integration between Kaldi and TensorFlow with support from the two teams, know this complexity first-hand. Their company has developed cloud software to bridge the gap between live phone conversations and business applications. Their goal is to let businesses analyze and act on the contents of the thousands of conversations their representatives have with customers in real-time and automatically handle tasks like data entry or responding to requests. IntelligentWire is currently focused on the contact center market, in which more than 22 million agents throughout the world spend 50 billion hours a year on the phone and about 25 billion hours interfacing with and operating various business applications.

For an ASR system to be useful in this context, it must not only deliver an accurate transcription but do so with very low latency in a way that can be scaled to support many thousands of concurrent conversations efficiently. In situations like this, recent advances in deep learning can help push technical limits, and TensorFlow can be very useful.

In the last few years, deep neural networks have been used to replace many existing ASR modules, resulting in significant gains in word recognition accuracy. These deep learning models typically require processing vast amounts of data at scale, which TensorFlow simplifies. However, several major challenges must still be overcome when developing production-grade ASR systems:

  • Algorithms - Deep learning algorithms give the best results when tailored to the task at hand, including the acoustic environment (e.g. noise), the specific language spoken, the range of vocabulary, etc. These algorithms are not always easy to adapt once deployed.
  • Data - Building an ASR system for different languages and different acoustic environments requires large quantities of multiple types of data. Such data may not always be available or may not be suitable for the use case.
  • Scale - ASR systems that can support massive amounts of usage and many languages typically consume large amounts of computational power.

One of the ASR system modules that exemplifies these challenges is the language model. Language models are a key part of most state-of-the-art ASR systems; they provide linguistic context that helps predict the proper sequence of words and distinguish between words that sound similar. With recent machine learning breakthroughs, speech recognition developers are now using language models based on deep learning, known as neural language models. In particular, recurrent neural language models have shown superior results over classic statistical approaches.
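
As a rough illustration of how a language model helps distinguish between words that sound similar, here is a minimal sketch of a classic statistical (bigram) model choosing between two candidate transcriptions. The tiny corpus, the candidate sentences, and the function names are all made up for this example; it does not use Kaldi or TensorFlow, and a neural language model would replace the counting step with a learned network.

```python
import math
from collections import Counter

# Hypothetical training text; a real system would use a far larger corpus.
CORPUS = [
    "please write a letter",
    "write a note",
    "turn right at the corner",
]


def train_bigram_model(sentences):
    """Count how often each word follows each other word."""
    context_counts = Counter()
    bigram_counts = Counter()
    vocabulary = set()
    for sentence in sentences:
        words = sentence.split()
        vocabulary.update(words)
        for previous, word in zip(words, words[1:]):
            context_counts[previous] += 1
            bigram_counts[(previous, word)] += 1
    return context_counts, bigram_counts, len(vocabulary)


def log_score(sentence, model):
    """Sum of add-one-smoothed log probabilities of each word given the previous one."""
    context_counts, bigram_counts, vocabulary_size = model
    words = sentence.split()
    total = 0.0
    for previous, word in zip(words, words[1:]):
        probability = (bigram_counts[(previous, word)] + 1) / (
            context_counts[previous] + vocabulary_size
        )
        total += math.log(probability)
    return total


model = train_bigram_model(CORPUS)

# Two transcriptions that sound alike; the acoustic evidence alone cannot separate them.
candidates = ["please write a letter", "please right a letter"]
best = max(candidates, key=lambda sentence: log_score(sentence, model))
print(best)
```

Because "write a" appears in the corpus and "right a" does not, the model scores the first candidate higher, which is the kind of linguistic context the paragraph above describes.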

However, the training and deployment of neural language models is complicated and highly time-consuming. For IntelligentWire, the integration of TensorFlow into Kaldi has reduced the ASR development cycle by an order of magnitude. If a language model already exists in TensorFlow, then going from model to proof of concept can take days rather than weeks; for new models, the development time can be reduced from months to weeks. Deploying new TensorFlow models into production Kaldi pipelines is straightforward as well, providing big gains for anyone working directly with Kaldi as well as the promise of more intelligent ASR systems for everyone in the future.

Similarly, this integration provides TensorFlow developers with easy access to a robust ASR platform and the ability to incorporate existing speech processing pipelines, such as Kaldi's powerful acoustic model, into their machine learning applications. Kaldi modules that feed the training of a TensorFlow deep learning model can be swapped cleanly, facilitating exploration, and the same pipeline that is used in production can be reused to evaluate the quality of the model.

We hope this Kaldi-TensorFlow integration will bring these two vibrant open-source communities closer together and support a wide variety of new speech-based products and related research breakthroughs. To get started using Kaldi with TensorFlow, please check out the Kaldi repo and take a look at an example of a Kaldi setup running with TensorFlow.


16th August 2017 |

Hamilton App Takes the Stage
Posted by David DeRemer (from Posse)

Whether it's opening night for a Broadway musical or launch day for your app, both are thrilling times for everyone involved. Our agency, Posse, collaborated with Hamilton to design, build, and launch the official Hamilton app... in only three short months.

We decided to use Firebase, Google's mobile development platform, for our backend and infrastructure, while we used Flutter, a new UI toolkit for iOS and Android, for our front-end. In this post, we share how we did it.

The Cloud Where It Happens

We love to spend time designing beautiful UIs, testing new interactions, and iterating with clients, and we don't want to be distracted by setting up and maintaining servers. To stay focused on the app and our users, we implemented a full serverless architecture and made heavy use of Firebase.

A key feature of the app is the ticket lottery, which offers fans a chance to get tickets to the constantly sold-out Hamilton show. We used Cloud Functions for Firebase, and a data flow architecture we learned about at Google I/O, to coordinate the lottery workflow between the mobile app, custom business logic, and partner services.

For example, when someone enters the lottery, the app first writes data to specific nodes in Realtime Database and the database's security rules help to ensure that the data is valid. The write triggers a Cloud Function, which runs business logic and stores its result to a new node in the Realtime Database. The newly written result data is then pushed automatically to the app.
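
The flow described above (validated write, triggered function, result node, automatic push) can be sketched with a small in-memory stand-in. This is not the Firebase SDK or the app's actual code: the `InMemoryDatabase` class, the node names, and the one-or-two-ticket rule are all hypothetical, chosen only to show the order of events.

```python
class InMemoryDatabase:
    """Toy stand-in for a realtime database: validate, store, notify, trigger."""

    def __init__(self):
        self.nodes = {}           # (collection, key) -> stored value
        self.validators = {}      # collection -> function(value) -> bool
        self.write_triggers = {}  # collection -> function(db, key, value)
        self.listeners = {}       # collection -> list of function(key, value)

    def write(self, collection, key, value):
        # 1. Security rules: reject invalid data before it is stored.
        validator = self.validators.get(collection)
        if validator is not None and not validator(value):
            raise ValueError(f"rejected write to {collection}/{key}")
        self.nodes[(collection, key)] = value
        # 2. Push the new value to anything listening on this collection.
        for callback in self.listeners.get(collection, []):
            callback(key, value)
        # 3. Run the function triggered by this write, if any.
        trigger = self.write_triggers.get(collection)
        if trigger is not None:
            trigger(self, key, value)


def valid_entry(entry):
    """Hypothetical rule: an entry must request one or two tickets."""
    tickets = entry.get("tickets")
    return isinstance(tickets, int) and 1 <= tickets <= 2


def process_entry(db, key, entry):
    """Hypothetical business logic: record the outcome in a separate result node."""
    db.write("lottery_results", key, {"status": "entered", "tickets": entry["tickets"]})


db = InMemoryDatabase()
db.validators["lottery_entries"] = valid_entry
db.write_triggers["lottery_entries"] = process_entry

# The "app" listens for results instead of polling for them.
received = []
db.listeners["lottery_results"] = [lambda key, value: received.append((key, value))]

db.write("lottery_entries", "user123", {"tickets": 2})
print(received)
```

The app only ever writes an entry and listens for a result; everything in between happens on the backend, which is what keeps the client code small.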

What'd I miss?

Because of Hamilton's intense fan following, we wanted to make sure that app users could get news the instant it was published. So we built a custom, web-based Content Management System (CMS) for the Hamilton team that used Firebase Realtime Database to store and retrieve data. The Realtime Database eliminated the need for a "pull to refresh" feature of the app. When new content is published via the CMS, the update is stored in Firebase Realtime Database and every app user automatically sees the update. No refresh, reload, or pull required!

Cloud Functions Left Us Satisfied

Besides powering our lottery integration, Cloud Functions was also extremely valuable in the creation of user profiles, sending push notifications, and our #HamCam — a custom Hamilton selfie and photo-taking experience. Cloud Functions resized the images, saved them in Cloud Storage, and then updated the database. By taking care of the infrastructure work of storing and managing the photos, Firebase freed us up to focus on making the camera fun and full of Hamilton style.

Developing UI? Don't Wait For It.

With only three months to design and deliver the app, we knew we needed to iterate quickly on the UX and UI. Flutter's hot reload development cycle meant we could make a change in our UI code and, in about a second, see the change reflected on our simulators and phones. No rebuilding, recompiling, or multi-second pauses required! Even the state of the app was preserved between hot reloads, making it very fast for us to iterate on the UI with our designers.

We used Flutter's reactive UI framework to implement Hamilton's iconic brand with custom UI elements. Flutter's "everything is a widget" approach made it easy for us to compose custom UIs from a rich set of building blocks provided by the framework. And, because Flutter runs on both iOS and Android, we were able to spend our time creating beautiful designs instead of porting the UI.

The FlutterFire project helped us access Firebase Analytics, Firebase Authentication, and Realtime Database from the app code. And because Flutter is open source and easy to extend, we even built a custom router library that helped us organize the app's UI code.

What comes next?

We enjoyed building the Hamilton app (find it on the Play Store or the App Store) in a way that allowed us to focus on our users and experiment with new app ideas and experiences. And based on our experience, we'd happily recommend serverless architectures with Firebase and customized UI designs with Flutter as powerful ways for you to save time building your app.

We already have plans to keep developing the Hamilton app in new ways, and we can't wait to release them soon!

If you want to learn more about Firebase or Flutter, we recommend the Firebase docs, the Firebase channel on YouTube, and the Flutter website.


7th August 2017 |

TensorFlow Serving 1.0
Posted by Kiril Gorovoy, Software Engineer

We've come a long way since our initial open source release in February 2016 of TensorFlow Serving, a high performance serving system for machine learned models, designed for production environments. Today, we are happy to announce the release of TensorFlow Serving 1.0. Version 1.0 is built from TensorFlow head, and our future versions will be minor-version aligned with TensorFlow releases.

For a good overview of the system, watch Noah Fiedel's talk given at Google I/O 2017.

When we first announced the project, it was a set of libraries providing the core functionality to manage a model's lifecycle and serve inference requests. We later introduced a gRPC Model Server binary with a Predict API and an example of how to deploy it on Kubernetes. Since then, we've worked hard to expand its functionality to fit different use cases and to stabilize the API to meet the needs of users. Today there are over 800 projects within Google using TensorFlow Serving in production. We've battle tested the server and the API and have converged on a stable, robust, high-performance implementation.

We've listened to the open source community and are excited to offer a prebuilt binary available through apt-get install. Now, to get started using TensorFlow Serving, you can simply install and run without needing to spend time compiling. As always, a Docker container can still be used to install the server binary on non-Linux systems.

With this release, TensorFlow Serving is also officially deprecating and stopping support for the legacy SessionBundle model format. SavedModel, TensorFlow's model format introduced as part of TensorFlow 1.0, is now the officially supported format.

To get started, please check out the documentation for the project and our tutorial. Enjoy TensorFlow Serving 1.0!


2nd August 2017 |

Actions on Google is now available for British English
Posted by Brad Abrams, Product Manager

Starting today, we're making all your apps built for the Google Assistant available to our en-GB users across Google Home (recently launched in the UK), select Android phones and the iPhone.

While your apps will appear in the local directory automatically this week, to make your apps truly local, here are a couple of things you should do:

  • There are four new TTS voices with an en-GB accent. We've automatically selected one for your app but you can change the selected voice or opt to use your current en-US TTS voice by going to the actions console.
  • We also recommend reviewing all your response text strings and adjusting them to account for differences between the two language variants, e.g., these pesky little Zeds. This will help make your app shine when accessed on the phone.

Apps like Akinator, Blinkist Minute and SongPop have already optimized their experience for en-GB Assistant users, and we can't wait to see who dives in next!

And for those of you who are excited about the ability to target Google Assistant users on en-GB, now is the perfect time to start building. Our developer tools, documentation and simulator have all been updated to make it easy for you to create, test and deploy your first app.

We'll continue to make the Actions on Google platform available in more languages over the coming year. If you have questions about internationalization, please reach out to us on Stack Overflow and Google+.



26th July 2017 |

Apply to Google Developers Launchpad Studio for AI & ML focused startups
Posted by Roy Glasberg, Global Lead, Google Developers Launchpad

The mission of Google Developers Launchpad is to enable startups from around the world to build great companies. In the last 4 years, we've learned a lot while supporting early and late-stage founders. From working with dynamic startups (such as teams applying Artificial Intelligence technology to solving transportation problems in Israel, improving tele-medicine in Brazil, and optimizing online retail in India), we've learned that these startups require specialized services to help them scale.

So today, we're launching a new initiative - Google Developers Launchpad Studio - a full-service studio that provides tailored technical and product support to Artificial Intelligence & Machine Learning startups, all in one place.

Whether you're a 3-person team or an established post-Series B startup applying AI/ML to your product offering, we want to start connecting with you.

Applications to join Launchpad Studio are now open and you can apply here.

The global headquarters of Launchpad Studio will be based in San Francisco at Launchpad Space, with events and activities taking place in Tel Aviv and New York. We plan to expand our activities and events to Toronto, London, Bangalore, and Singapore soon.

As a member of the Studio program, you'll find services tailored to your startup's unique needs and challenges, such as:

  • Applied AI integration toolkits: Datasets, testing environments, rapid prototyping, simulation tools, and architecture troubleshooting.
  • Product validation support: Industry-specific proof of concept and pilots, as well as use case workshops with Fortune 500 industry practitioners and other experts.
  • Access to AI experts: Best practice advice from our global community of AI thought leaders, which includes Peter Norvig, Dan Ariely, Yossi Matias, Chris DiBona, and more.
  • Access to AI practitioners and investors: Interaction with some of the best AI and ML engineers, product managers, industry leaders and VCs from Google, Silicon Valley, and other international locations.

We're looking forward to working closely with you in the AI & Machine Learning space soon!

"Innovation is open to everyone, worldwide. With this global program we now have an important opportunity to support entrepreneurs everywhere in the world who are aiming to use AI to solve the biggest challenges." Yossi Matias, VP of Engineering, Google


19th July 2017 |

New security protections to reduce risk from unverified apps
Originally posted by Naveen Agarwal, Identity team and Wesley Chun (@wescpy), Developer Advocate, G Suite on the G Suite Developers Blog

We're constantly working to secure our users and their data. Earlier this year, we detailed some of our latest anti-phishing tools and rolled out developer-focused updates to our app publishing processes, risk assessment systems, and user-facing consent pages. Most recently, we introduced OAuth apps whitelisting in G Suite to enable admins to choose exactly which third-party apps can access user data.

Over the past few months, we've required that some new web applications go through a verification process prior to launch based upon a dynamic risk assessment.

Today, we're expanding upon that foundation, and introducing additional protections: bolder warnings to inform users about newly created web apps and Apps Scripts that are pending verification. Additionally, the changes we're making will improve the developer experience. In the coming months, we will begin expanding the verification process and the new warnings to existing apps as well.

Protecting against unverified apps

Beginning today, we're rolling out an "unverified app" screen for newly created web applications and Apps Scripts that require verification. This new screen replaces the "error" page that developers and users of unverified web apps receive today.

The "unverified app" screen precedes the permissions consent screen for the app and lets potential users know that the app has yet to be verified. This will help reduce the risk of user data being phished by bad actors.

The "unverified app" consent flow

This new notice will also help developers test their apps more easily. Since users can choose to acknowledge the 'unverified app' alert, developers can now test their applications without having to go through the OAuth client verification process first (see our earlier post for details).

Developers can follow the steps laid out in this help center article to begin the verification process, remove the interstitial, and prepare their app for launch.

Extending security protections to Google Apps Script

We're also extending these same protections to Apps Script. Beginning this week, new Apps Scripts requesting OAuth access to data from consumers or from users in other domains may also see the "unverified app" screen. For more information about how these changes affect Apps Script developers and users, see the verification documentation page.

Apps Script is proactively protecting users from abusive apps in other ways as well. Users will see new cautionary language reminding them to "consider whether you trust" an application before granting OAuth access, as well as a banner identifying web pages and forms created by other users.

Updated Apps Script pre-OAuth alert with cautionary language
Apps Script user-generated content banner

Extending protections to existing apps

In the coming months, we will continue to enhance user protections by extending the verification process beyond newly created apps, to existing apps as well. As a part of this expansion, developers of some current apps may be required to go through the verification flow.

To help ensure a smooth transition, we recommend developers verify that their contact information is up-to-date. In the Google Cloud Console, developers should ensure that the appropriate and monitored accounts are granted either the project owner or billing account admin IAM role. For help with granting IAM roles, see this help center article.

In the API manager, developers should ensure that their OAuth consent screen configuration is accurate and up-to-date. For help with configuring the consent screen, see this help center article.

We're committed to fostering a healthy ecosystem for both users and developers. These new notices will inform users automatically if they may be at risk, enabling them to make informed decisions to keep their information safe, and will make it easier for developers to test and develop apps.


14th July 2017 |

Google Developer Days are coming to Europe
Posted by Jason Titus, Vice President, Developer Product Group

I'm happy to share that we've opened registration for the European installment of our global event series, Google Developer Days (GDD). Google Developer Days showcase our latest developer product and platform updates to help you develop high-quality apps, grow and retain an active user base, and tap into tools to earn more.

Google Developer Days — Europe (GDD Europe) will take place on September 5-6 2017, in Krakow, Poland. We'll feature technical talks on a range of products including Android, the Mobile Web, Firebase, Cloud, Machine Learning, and IoT. In addition, we'll offer opportunities for you to join hands-on training sessions, and 1:1 time with Googlers and members of our Google Developers Experts community. We're looking forward to meeting you face-to-face so we can better understand your needs and improve our offerings for you.

If you're interested in joining us at GDD Europe, registration is now open.

Can't make it to Krakow? We've got you covered. All talks will be livestreamed on the Google Developers YouTube channel, and session recordings will be available there after the event. Looking to tune into the action with developers in your own neighborhood? Consider joining a GDD Extended event or organizing one for your local developer community.

Whether you're planning to join us in-person or remotely, stay up-to-date on the latest announcements using #GDDEurope on Twitter, Facebook, and Google+.

We're looking forward to seeing you in Europe soon!


30th June 2017 |

Modifying events with the Google Calendar API
Originally posted by Wesley Chun (@wescpy), Developer Advocate, G Suite, on the G Suite Developers Blog.

You might be using the Google Calendar API, or alternatively email markup, to insert events into your users' calendars. Thankfully, these tools allow your apps to do this seamlessly and automatically, which saves your users a lot of time. But what happens if plans change? You need your apps to also be able to modify an event.

While email markup does support this update, it's limited in what it can do, so in today's video, we'll show you how to modify events with the Calendar API. We'll also show you how to create repeating events. Check it out:

Imagine a potential customer being interested in your product, so you set up one or two meetings with them. As their interest grows, they request regularly-scheduled syncs as your product makes their short list—your CRM should be able to make these adjustments in your calendar without much work on your part. Similarly, a "dinner with friends" event can go from a "rain check" to a bi-monthly dining experience with friends you've grown closer to. Both of these events can be updated with a JSON request payload like what you see below to adjust the date and make it repeating:

TIMEZONE = "America/Los_Angeles"
EVENT = {
    "start": {"dateTime": "2017-07-01T19:00:00", "timeZone": TIMEZONE},
    "end": {"dateTime": "2017-07-01T22:00:00", "timeZone": TIMEZONE},
    "recurrence": ["RRULE:FREQ=MONTHLY;INTERVAL=2;UNTIL=20171231"]
}

This event can then be updated with a single call to the Calendar API's events().patch() method, which in Python would look like the following given the request data above, GCAL as the API service endpoint, and a valid EVENT_ID to update:

GCAL.events().patch(calendarId='primary', eventId=EVENT_ID,
        sendNotifications=True, body=EVENT).execute()
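
To see which dates the recurrence rule in the payload above produces, it can help to expand it by hand. The following is a minimal sketch using only the Python standard library. The helper names `parse_rrule` and `monthly_occurrences` are hypothetical and not part of the Calendar API, and the expansion handles only the simple monthly case used here; the Calendar service does the real expansion for you.

```python
from datetime import date, datetime

RULE = "RRULE:FREQ=MONTHLY;INTERVAL=2;UNTIL=20171231"


def parse_rrule(rule):
    """Split 'RRULE:KEY=VALUE;KEY=VALUE' into a dictionary of its parts."""
    body = rule.split(":", 1)[1]
    return dict(part.split("=", 1) for part in body.split(";"))


def monthly_occurrences(start, interval, until):
    """List the dates of a monthly rule that repeats every `interval` months."""
    dates = []
    year, month = start.year, start.month
    while (year, month) <= (until.year, until.month):
        try:
            candidate = date(year, month, start.day)
        except ValueError:
            candidate = None  # e.g. day 31 in a 30-day month is skipped
        if candidate is not None and candidate <= until:
            dates.append(candidate)
        # Advance by `interval` months, rolling over into the next year if needed.
        month += interval
        year += (month - 1) // 12
        month = (month - 1) % 12 + 1
    return dates


parts = parse_rrule(RULE)
until = datetime.strptime(parts["UNTIL"], "%Y%m%d").date()
occurrences = monthly_occurrences(date(2017, 7, 1), int(parts["INTERVAL"]), until)
for occurrence in occurrences:
    print(occurrence.isoformat())
```

With the start date from the payload, the rule yields July 1, September 1, and November 1, 2017; the next candidate, January 1, 2018, falls after the UNTIL date and is dropped.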

If you want to dive deeper into the code sample, check out this blog post. Also, if you missed it, check out this video that shows how you can insert events into Google Calendar as well as the official API documentation. Finally, if you have a Google Apps Script app, you can access Google Calendar programmatically with its Calendar service.

We hope you can use this information to enhance your apps and give your users an even better and more timely experience.


28th June 2017 |

Experimenting with VR Ad formats at Area 120
Posted by Aayush Upadhyay and Neel Rao, Area 120

At Area 120, Google's internal workshop for experimental ideas, we work on early-stage projects and iterate quickly to test concepts. We heard from developers that they're looking at how to make money to fund their VR applications, so we started experimenting with what a native, mobile VR ad format might look like.

Developers and users have told us they want to avoid disruptive, hard-to-implement ad experiences in VR. So our first idea for a potential format presents a cube to users, with the option to engage with it and then see a video ad. By tapping on the cube or gazing at it for a few seconds, the cube opens a video player where the user can watch, and then easily close, the video. Here's how it works:

Our work focuses on a few key principles - VR ad formats should be easy for developers to implement, native to VR, flexible enough to customize, and useful and non-intrusive for users. Our Area 120 team has seen some encouraging results with a few test partners, and would love to work with the developer community as this work evolves - across Cardboard (on Android and iOS), Daydream and Samsung Gear VR.

If you're a VR developer (or want to be one) and are interested in testing this format with us, please fill out this form to apply for our early access program. We have an early-stage SDK available and you can get up and running easily. We're excited to continue experimenting with this format and hope you'll join us for the ride!


20th June 2017 |

Brotli Compression in Google Display Ads
Posted by Michael Burns, Software Engineer, Publisher Tagging & Ads Latency Team

Our goal is to help publishers monetize their content and build sustainable businesses through advertising products that let sites load as fast as possible, minimizing the impact on user experience.

Almost two years ago, our compression team announced a new compression algorithm called Brotli. Today, we are happy to announce that the Brotli compression algorithm is now being used to compress Google Display Ads whenever possible. In our experiments, we see data savings of 15% in aggregate over standard gzip compression, and in some instances, a savings of over 40%! This reduces the amount of data sent to end users by tens of thousands of gigabytes every day! This also results in faster page loads and less battery consumption.

We hope results like this will encourage wider adoption and will advance web standards such as Brotli compression.


12th June 2017 |

Introducing Team Drives for developers
Originally posted by Hodie Meyers, Product Manager, Google Drive, and Wesley Chun (@wescpy), Developer Advocate, G Suite on the G Suite Developers Blog

Enterprises are always looking for ways to operate more efficiently, and equipping developers with the right tools can make a difference. We launched Team Drives this year to bring the best of what users love about Drive to enterprise teams. We also updated the Google Drive API, so that developers can leverage Team Drives in the apps they build.

In this latest G Suite Dev Show video, we cover how you can leverage the functionality of Team Drives in your apps. The good news is you don't have to learn a completely new API—Team Drives features are built into the Drive API so you can build on what you already know. Check it out:

By the end of this video, you'll be familiar with four basic operations to help you build Team Drives functionality right into your apps:

  1. How to create Team Drives
  2. How to add members/users to your Team Drives
  3. How to create folders in Team Drives (just like creating a regular Drive folder)
  4. How to upload/import files to Team Drives folders (just like uploading files to regular folders)
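
As a rough sketch of the four operations listed above, the helpers below build the request arguments each call would need. They assume the Drive API v3 Team Drives parameters as they stood at the time of this post (`requestId`, `supportsTeamDrives`) and an authorized Python client object named `DRIVE`; the helper names and the sample values are hypothetical, and nothing here contacts the API.

```python
import uuid

FOLDER_MIME_TYPE = "application/vnd.google-apps.folder"


def create_team_drive_args(name):
    """Arguments for DRIVE.teamdrives().create(...); requestId must be unique per request."""
    return {"body": {"name": name}, "requestId": str(uuid.uuid4()), "fields": "id"}


def add_member_args(team_drive_id, email, role="organizer"):
    """Arguments for DRIVE.permissions().create(...) to add one user to a Team Drive."""
    return {
        "fileId": team_drive_id,
        "supportsTeamDrives": True,
        "body": {"type": "user", "role": role, "emailAddress": email},
        "fields": "id",
    }


def create_folder_args(team_drive_id, folder_name):
    """Arguments for DRIVE.files().create(...); a folder is a file with a special MIME type."""
    return {
        "supportsTeamDrives": True,
        "body": {
            "name": folder_name,
            "mimeType": FOLDER_MIME_TYPE,
            "parents": [team_drive_id],
        },
        "fields": "id",
    }


def upload_file_args(folder_id, file_name):
    """Metadata for DRIVE.files().create(...); the file contents are attached separately."""
    return {
        "supportsTeamDrives": True,
        "body": {"name": file_name, "parents": [folder_id]},
        "fields": "id",
    }


# With an authorized client, each call would look roughly like:
#   DRIVE.teamdrives().create(**create_team_drive_args("Corporate shared drive")).execute()
team_drive_request = create_team_drive_args("Corporate shared drive")
folder_request = create_folder_args("TEAM_DRIVE_ID", "Q3 reports")
print(team_drive_request["body"], folder_request["body"]["mimeType"])
```

The folder and upload helpers differ from their regular Drive equivalents only in the `supportsTeamDrives` flag and the parent ID, which reflects the point that Team Drives builds on the Drive API you already know.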

Want to explore the code further? Check out the deep dive blog post. In all, the Drive API can help a variety of developers create solutions that work with both Google Drive and Team Drives. Whether you're an Independent Software Vendor (ISV), System Integrator (SI) or work in IT, there are many ways to use the Drive API to enhance productivity, help your company migrate to G Suite, or build tools to automate workflows.

Team Drives features are available in both Drive API v2 and v3, and more details can be found in the Drive API documentation. We look forward to seeing what you build with Team Drives!


9th June 2017 |

Introducing Blockly 1.0 for Android and iOS
Posted by Erik Pasternak and the Kids Coding Team

Over the past five years, developers have created hundreds of projects with Blockly, our open source library for creating block-based coding experiences. These have ranged from education platforms to electronics kits like littleBits and even Android app creation tools like MIT App Inventor. Last year, we also announced our collaboration with the Scratch Team to develop Scratch Blocks, a fork of Blockly optimized for creating coding apps for kids.

Today, we're finalizing our 1.0 release of Blockly on Android and iOS. These versions have everything you need to use Blockly natively in your mobile app, including:

  • Blockly's standard UI
  • Custom blocks, toolbox categories, and layouts
  • Functions, variables, mutators, and extensions
  • Code generation in JavaScript, Python, Dart, PHP, and Lua
  • Internationalization support (including for RTL languages)

While our 1.0 update today is focused on native mobile, we've also made several updates to the web project over the past six months. We've made major improvements to performance and testing, added more structured APIs, and improved touch support for the mobile web. In addition, we improved Internet Explorer and Edge support; Blockly is fully supported on IE10+.

We've done a lot of work to ease cross-platform development, too! All blocks can now be defined by JSON, allowing a single set of block definitions to be used for web, iOS, and Android. Check out the documentation for more details on all three platforms.
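
To give a sense of what a JSON block definition looks like, here is a minimal sketch that builds one in Python and serializes it. The block itself (`string_length`) and its field values are illustrative assumptions modeled on the general shape of Blockly's JSON format, not a definition shipped with this release.

```python
import json

# One block: takes a string input and outputs a number.
string_length_block = {
    "type": "string_length",
    "message0": "length of %1",  # %1 is replaced by the first entry in args0
    "args0": [
        {"type": "input_value", "name": "VALUE", "check": "String"}
    ],
    "output": "Number",
    "colour": 160,
    "tooltip": "Returns the number of letters in the provided text.",
}

# A definitions file is a JSON array, so the same text can be shared across platforms.
block_definitions_json = json.dumps([string_length_block], indent=2)
print(block_definitions_json)
```

Because the definition is plain data rather than platform code, the same serialized text can be loaded by the web, iOS, and Android versions.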

Get started right away with our iOS Codelab (Android coming soon)! To learn more about Blockly, check out the above intro video, visit our developer site, join our mailing list, or jump right into the code for web, Android, or iOS.