Google Data Talk: January & February Updates

Data is the new gold, and we all know it! In the first two months of this year, Google has already announced quite a few game-changing updates. Curious? Well, you came to the right place! We gathered our favourite news from January and February in this blog post.

Taking the next step for semi-structured data in BigQuery

In early January, Google announced the public preview of native JSON support in BigQuery, which lets you store and analyse semi-structured data directly in BigQuery.

With the new JSON data type, along with advanced JSON features and new JSON functions, semi-structured data in BigQuery can now be used intuitively and queried in its native format.
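To make "semi-structured" a bit more concrete, here is a minimal plain-Python sketch. It is not BigQuery code: the sample events and the `get_path` helper are invented for illustration. It only shows the underlying idea of querying records of differing shapes by path, without first flattening them into a fixed schema.

```python
import json

# Hypothetical sample events: same "table", different shape per record.
RAW_EVENTS = [
    '{"user": {"id": 1, "name": "Ann"}, "device": {"os": "android"}}',
    '{"user": {"id": 2}, "tags": ["beta", "nl"]}',
]


def get_path(document, path, default=None):
    """Walk a dotted path such as 'user.name' through nested dicts."""
    current = document
    for key in path.split("."):
        if not isinstance(current, dict) or key not in current:
            return default
        current = current[key]
    return current


events = [json.loads(raw) for raw in RAW_EVENTS]
names = [get_path(event, "user.name") for event in events]
print(names)  # ['Ann', None]
```

The second record has no `user.name`, so the lookup returns `None` instead of failing, which is the kind of flexibility a native JSON type is meant to offer.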

You can register for the preview here

Read more: Announcing preview of BigQuery’s native support for semi-structured data

New audit tool in Cloud SQL for MySQL

If you manage sensitive data in your MySQL database, you are probably obliged to capture and monitor user activity in that database. Although you could set up MySQL’s slow query log or general log to create an audit trail of user activity, those logs significantly impact database performance and aren’t formatted optimally for auditing. Purpose-built, open source audit plugins are better, but they lack some of the advanced security features that enterprise users need, such as rule-based auditing and masked results.

That’s why Google has developed a new audit plugin: the Cloud SQL for MySQL Audit Plugin. It offers enterprise-grade database auditing. Organisations can define audit rules that determine which database activities get recorded, and the plugin masks sensitive information, such as user passwords, in the process. The resulting logs are sent to Cloud Logging, where they can be analysed to gain insights into database operations.
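The two ideas mentioned above, rule-based auditing and masking, can be sketched conceptually in a few lines of Python. This is not the plugin's actual interface: the `AuditRule` class, the `audit` function and the masking pattern are all hypothetical and only illustrate the principle.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class AuditRule:
    """Hypothetical rule: which user/database/operation combinations get logged."""
    user: str        # "*" matches any user
    database: str    # "*" matches any database
    operation: str   # e.g. "select", "update", or "*"

    def matches(self, user, database, operation):
        return all(
            pattern in ("*", value)
            for pattern, value in (
                (self.user, user),
                (self.database, database),
                (self.operation, operation),
            )
        )


# Matches the quoted literal after IDENTIFIED BY or PASSWORD( so it can be masked.
_PASSWORD = re.compile(r"(identified\s+by\s+|password\s*\(\s*)'[^']*'", re.IGNORECASE)


def mask(statement):
    """Replace password literals with a fixed placeholder."""
    return _PASSWORD.sub(r"\1'***'", statement)


def audit(rules, user, database, operation, statement):
    """Return a masked log record if any rule matches, else None."""
    if any(rule.matches(user, database, operation) for rule in rules):
        return {"user": user, "db": database, "op": operation, "sql": mask(statement)}
    return None


rules = [AuditRule(user="*", database="payroll", operation="*")]
record = audit(rules, "bob", "payroll", "create_user",
               "CREATE USER eve IDENTIFIED BY 's3cret'")
print(record["sql"])  # CREATE USER eve IDENTIFIED BY '***'
```

Activity on any other database matches no rule in this example and is simply not logged, which keeps the audit trail focused on what you are obliged to track.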

Read more: Keep tabs on your tables: Cloud SQL for MySQL launches database auditing

Cloud Bigtable implements autoscaling

Cloud Bigtable is Google’s fully managed, scalable NoSQL database service, intended for large operational and analytical workloads. Today, Bigtable manages more than 10 exabytes of data and processes more than 5 billion requests per second at peak times. In January, Google launched autoscaling for Bigtable, which automatically adds or removes capacity in response to changing demand from your applications. With autoscaling, you only pay for what you need and spend less time managing capacity!
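To give a feel for what "adding or removing capacity in response to demand" means, here is a toy proportional scaler in Python. It is a simplified illustration, not Google's algorithm: the function name, its parameters and the numbers are our own assumptions, and a real autoscaler takes more signals into account than CPU alone.

```python
import math


def recommend_nodes(current_nodes, cpu_pct, target_pct, min_nodes, max_nodes):
    """Toy scaler: size the cluster so average CPU lands near the target.

    cpu_pct and target_pct are whole percentages (for example 90 and 60).
    """
    if current_nodes < 1 or not 0 < target_pct <= 100:
        raise ValueError("need at least one node and a target between 1 and 100")
    # Total work stays the same, so nodes * utilisation is roughly constant.
    wanted = math.ceil(current_nodes * cpu_pct / target_pct)
    # Never leave the configured bounds.
    return max(min_nodes, min(max_nodes, wanted))


print(recommend_nodes(4, 90, 60, min_nodes=2, max_nodes=10))  # 6
print(recommend_nodes(4, 15, 60, min_nodes=2, max_nodes=10))  # 2
```

The minimum and maximum act as guard rails: the cluster grows during a peak, shrinks again when traffic drops, and never exceeds the budget you set.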


In addition to autoscaling, Google launched a series of other innovations in Bigtable: 

  • Doubling the storage limit;
  • The availability of cluster groups, giving you the flexibility to determine how you route your application traffic; 
  • More detailed usage statistics, which enable better insights, faster troubleshooting and better workload management.

Read more: Cloud Bigtable launches Autoscaling plus new features for optimising costs and improving manageability

Explainable AI now available in BigQuery

An ML platform is often a black box: a big mystery. You start with a lot of data, somewhere in the system the data gets processed, and the platform generates results. Who knows what happens in between? You’re right, nobody. That’s a problem because, without insight into how a model reaches its output, you can never know how reliable your result is.

As always, Google has the answer! With Explainable AI (XAI), Google gives you the tools to interpret and understand how ML models arrive at their decisions. This is becoming an absolute must for organisations that want to keep using ML solutions. That’s why Google is expanding Explainable AI across its solutions, and BigQuery was next on the list.
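What does an "explanation" look like in practice? A common form is feature attribution: how much each input pushed a prediction up or down compared with a baseline. The sketch below does this for a simple linear model in plain Python. It is not BigQuery code, and the house-price model and all of its numbers are invented for illustration.

```python
def predict(weights, bias, row):
    """Prediction of a linear model: bias plus weighted sum of the features."""
    return bias + sum(weights[name] * row[name] for name in weights)


def linear_attributions(weights, baseline, instance):
    """Per-feature contribution of a linear model relative to a baseline input."""
    return {
        name: weights[name] * (instance[name] - baseline[name])
        for name in weights
    }


# Hypothetical house-price model (all numbers invented).
weights = {"area_m2": 2000, "rooms": 5000, "age_years": -500}
bias = 50000
baseline = {"area_m2": 100, "rooms": 3, "age_years": 20}
house = {"area_m2": 120, "rooms": 4, "age_years": 10}

contributions = linear_attributions(weights, baseline, house)
print(contributions)  # {'area_m2': 40000, 'rooms': 5000, 'age_years': 5000}
```

The contributions add up to the difference between the prediction for this house and the prediction for the baseline house, so you can see exactly which features drove the result.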

Read the details here: BigQuery Explainable AI now in GA to help you interpret your machine learning models

Serverless Spark, an industry first for Google!

Many companies use Apache Spark today to develop their use cases for data engineering, data exploration and machine learning. Its speed, simplicity and flexible choice of programming languages are the product's main differentiators. Until now, however, managing clusters and tuning infrastructure has been inefficient. There is no integrated experience across the different use cases, which leads to productivity losses and increased governance risks, and reduces the business value of using Spark.

All hail Google, because since February, Serverless Spark, the industry’s first autoscaling serverless Spark, is available on GCP. Google has also announced that Spark through BigQuery is being released in preview. This allows BigQuery users to use serverless Spark alongside BigQuery SQL for their data analysis.

Read more: Simplify data processing and data science jobs with Serverless Spark, now available on Google Cloud

New in Cloud SQL: IAM Conditions and Tags

If you are responsible for managing Cloud SQL instances, the security of your databases should be one of your top priorities. As the number of users grows, that can become quite a challenge. More Cloud SQL users means more work to ensure the right users have the right access to the right database instances. Working with separate projects was a solution, but as the number of projects increased, management became more complex.

Those issues should be resolved with the rollout of the new Cloud SQL support for IAM conditions. IAM conditions are part of Cloud Identity and Access Management (IAM), the service on which Cloud SQL and other Google Cloud services rely to determine access to cloud resources. Since February, you can add a condition that describes the circumstances in which a person should have access to a Cloud SQL role. With IAM conditions, you can now authorise actions based on different characteristics.

In addition, Cloud SQL support for tags became generally available in February. With tags, you can organise Google Cloud resources and control access to them! You can manage tags through Resource Manager and refer to tags in IAM policy bindings to grant conditional access to resources with those tag bindings. Cloud SQL instances can either inherit a tag from the project or folder they reside in, or be tagged directly.

You can use IAM conditions and tags together to control admin and connection access to Cloud SQL more securely and easily.
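As a rough sketch of what such bindings can look like, the Python snippet below builds an IAM policy fragment with two conditional bindings and prints it as JSON. Conditions are written in Common Expression Language (CEL). Every identifier here (project, members, instance prefix, tag key and value) is a placeholder we made up, and you should check the exact resource name format and attributes in the Cloud SQL documentation before using anything like this.

```python
import json

# All identifiers below are placeholders, not real resources.
time_and_name_binding = {
    "role": "roles/cloudsql.client",
    "members": ["user:analyst@example.com"],
    "condition": {
        "title": "prod-instances-until-2023",
        "description": "Connect to prod-* instances until the end of 2022",
        # CEL expression: instance name prefix plus an expiry date.
        "expression": (
            'resource.name.startsWith("projects/my-project/instances/prod-") && '
            'request.time < timestamp("2023-01-01T00:00:00Z")'
        ),
    },
}

tag_binding = {
    "role": "roles/cloudsql.admin",
    "members": ["group:dba-team@example.com"],
    "condition": {
        "title": "only-env-prod",
        "description": "Admin access to instances tagged env=prod",
        # CEL expression: match on a tag key (with its parent ID) and value.
        "expression": "resource.matchTag('123456789012/env', 'prod')",
    },
}

# Policies that contain conditions must use policy version 3.
policy = {"version": 3, "bindings": [time_and_name_binding, tag_binding]}
print(json.dumps(policy, indent=2))
```

The first binding combines a resource-name check with an expiry date; the second grants access only to instances that carry a given tag, whether set directly or inherited from the project or folder.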

Find out more: Cloud SQL launches support for IAM Conditions and Tags

Simplify real-time message processing with new API

In event-driven environments, companies are often swamped with all kinds of messages, from customers asking questions to sensors reporting status changes. Using that data for real-time forecasting and anomaly detection is key to taking customer service to the next level and triggering the right actions in a timely manner. This type of real-time analysis often requires tailor-made solutions, though, leaving companies with expensive and complex systems.

To help businesses leverage their data more easily and effectively, Google announced the Timeseries Insights API: a highly efficient and scalable system that makes it easier to quickly gather insights from streamed time series. The Timeseries Insights API is fully integrated with Google Cloud Storage and Google Cloud Pub/Sub, allowing it to process datasets with trillions of events.
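For readers new to anomaly detection on time series, the toy Python example below shows the basic idea with a rolling z-score. It does not use the Timeseries Insights API and says nothing about how that service works internally; the function, the window size, the threshold and the sensor readings are all our own assumptions.

```python
from collections import deque
from statistics import mean, pstdev


def detect_anomalies(values, window=5, threshold=3.0):
    """Yield (index, value) for points far from the mean of the preceding window."""
    history = deque(maxlen=window)
    for index, value in enumerate(values):
        if len(history) == window:
            mu = mean(history)
            sigma = pstdev(history)
            if sigma == 0:
                # A perfectly flat window: treat any change as anomalous.
                is_anomaly = value != mu
            else:
                is_anomaly = abs(value - mu) / sigma > threshold
            if is_anomaly:
                yield index, value
        history.append(value)


# Hypothetical temperature readings with one obvious spike.
readings = [20, 21, 19, 20, 20, 21, 45, 20, 19]
print(list(detect_anomalies(readings)))  # [(6, 45)]
```

Doing this for one short list is easy. Doing it in real time over a huge volume of events is where a managed service earns its keep.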

Read more: Introducing Timeseries Insights API: real-time forecasting and anomaly detection over trillions of events

Well, that’s it. If you want to stay up to date, we recommend you subscribe to our newsletter to receive these updates directly in your inbox!

Competence Center:

Bart Gouweloose

5 min
