Software, Mechanical & Systems Engineering Blog | 6 River Systems
https://6river.com/category/engineering/
6 River Systems is the new way companies fulfill.
Wed, 18 Jan 2023

How we use Google BigQuery, Cloud and Colab for Calibration Optimization and Training Machine Learning Models
https://6river.com/google-bq-cloud-collab-for-calibration-training-machine-learning-models/
Tue, 23 Feb 2021

The post How we use Google BigQuery, Cloud and Colab for Calibration Optimization and Training Machine Learning Models appeared first on 6 River Systems.

Camera calibration: How 6RS knows robots are seeing straight

For an autonomous mobile robot, maintaining the accuracy of its visual cameras is vital to operation and requires regular, reliable calibration. In real-world robotics operations, the position of the RGBD cameras (given by X, Y, Z, yaw, pitch and roll) tends to change: mechanical distortions, varying load on the robot, rough terrain, even inconsistencies in manufacturing contribute to tiny shifts in the camera’s position and orientation. Over time, these slight changes may result in incorrect obstacle detection and estimation, producing false alarms and unnecessary stops that reduce the robot’s average speed and, consequently, slow down the whole operation.

At 6 River Systems, we mitigate calibration drift by running an auto-calibration protocol whenever the robot travels to the auto-charger. However, if the amount of drift becomes too great, a reliability engineer must intervene by remotely logging in and manually calibrating the camera. This process used to take approximately two hours during which the robot had to be pulled out of commission, resulting in productivity loss for both the customer (the out-of-operation robot) and the engineer (scheduling and completing a two-hour manual intervention). For our engineering team and for our customers, the time required to fix out-of-calibration cameras was a significant pain point.

To solve this problem, we developed an interactive tool using Google Colab that can be used by anyone on the reliability or operations teams. The whole process is completed offline, and any adjustments are wirelessly updated on the robot in just a few minutes without halting the robot’s operation at the customer site. Because the tool does not interact with the robot until the final step of updating calibration values, it can be used while the robot is disconnected from the network or even powered off.

How? Data snapshots

If the tool does not interact with the robot at all, where do the data (point clouds) come from? The 6RS perception team has developed a system called Snapshots (which deserves a blog post in itself): an event-driven data buffering and recording software module.

Imagine you are diagnosing and solving a recurring problem with a robot. You would want to capture and analyze data from when the problem occurred as well as what happened just before the issue manifested. We seek to do this whenever working on diagnostics, improvement and enhancements in a range of algorithms. Last year we developed the Snapshot module to achieve the data buffering and recording for any configured event(s).

In the case of calibration, we configured our snapshot system to record point cloud and depth image data for a few seconds as the robot prepares to dock with the auto-charger. Every time a robot approaches the charger, a ROSbag is created on the robot’s disk and uploaded to the cloud, enabling the offline calibration optimization performed by the tool discussed in this post.

How the calibration tool works

The calibration tool before a robot is chosen for analysis

The user selects parameters like date, site and robot, and the tool downloads the data accordingly. All of the dropdowns and fields are updated dynamically based on the preceding selection. The whole experience feels like using a web application, yet it is written entirely in Python with the help of the “form” functionality in Google Colab and IPython widgets.
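As a rough sketch of how such cascading selections can be wired up with ipywidgets (the site and robot names here are placeholders; the real tool populates its options from BigQuery):

```python
import ipywidgets as widgets

# Placeholder data; the actual tool fetches these option lists from BigQuery.
ROBOTS_BY_SITE = {
    "site-a": ["chuck-001", "chuck-002"],
    "site-b": ["chuck-101"],
}

site = widgets.Dropdown(options=sorted(ROBOTS_BY_SITE), description="Site")
robot = widgets.Dropdown(options=ROBOTS_BY_SITE[site.value], description="Robot")

def on_site_change(change):
    # Repopulate the robot dropdown whenever the site selection changes.
    robot.options = ROBOTS_BY_SITE[change["new"]]

site.observe(on_site_change, names="value")
```

Chaining `observe` callbacks like this is what makes each dropdown depend on the one before it, giving the web-application feel without leaving Python.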

As we described in a previous blog post, the calibration offset analytics are stored in Google BigQuery, while the ROSbags containing point cloud data are recorded and uploaded to the cloud by the Snapshot module.

In the Google Colab environment, we can triage any robot’s calibration status from any of our customer sites by accessing their most recent Snapshot. This information is fetched using Google’s Python API for BigQuery. Upon selecting a particular robot from the dropdown input, BigQuery fetches calibration data and Google Colab downloads the ROSbag associated with that instance of calibration.
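As a rough sketch of that lookup (the dataset, table and column names below are illustrative, not our actual schema), the BigQuery Python client can fetch the most recent snapshot for a selected robot:

```python
# NOTE: table/column names are hypothetical; in production, prefer query
# parameters over f-string interpolation to avoid injection issues.

def latest_snapshot_query(site: str, robot_id: str) -> str:
    """Build a query for a robot's most recent calibration snapshot."""
    return f"""
        SELECT robot_id, bag_uri, recorded_at
        FROM `analytics.calibration_snapshots`
        WHERE site = '{site}' AND robot_id = '{robot_id}'
        ORDER BY recorded_at DESC
        LIMIT 1
    """

def fetch_latest_snapshot(site: str, robot_id: str):
    """Run the query using Colab's authenticated credentials."""
    from google.cloud import bigquery  # deferred so the query builder is testable offline
    client = bigquery.Client()
    rows = client.query(latest_snapshot_query(site, robot_id)).result()
    return next(iter(rows), None)  # None if the robot has no snapshots yet
```

The returned row's `bag_uri` would then point at the ROSbag for Colab to download.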

The visual representation of a robot's point cloud

The user adjusts the calibration offsets using the respective fields, then calculates the transformation and visualizes the point cloud with one click. We used Matplotlib to visualize the point cloud messages from the ROSbags after converting the messages to NumPy arrays.

Once all of the data are fetched and parsed, the engineer can review a visual representation of what the robot “sees” as well as a table of its camera’s calibration values. As the user adjusts the UI elements in the calibration tool, the visual representation of the points updates to illustrate the new values. Using this process, a user can make additional corrections to the offsets while visualizing the point clouds.
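A simplified sketch of that conversion and offset adjustment (the flat [x, y, z, …] message layout and the yaw/height offsets are illustrative assumptions, not the actual message format):

```python
import numpy as np

def points_to_array(flat_xyz):
    """Reshape a flat [x0, y0, z0, x1, y1, z1, ...] list into an (N, 3) array."""
    return np.asarray(flat_xyz, dtype=float).reshape(-1, 3)

def apply_offsets(points, yaw=0.0, dz=0.0):
    """Rotate the cloud about Z by `yaw` radians and shift its height by `dz`."""
    c, s = np.cos(yaw), np.sin(yaw)
    rot = np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])
    out = points @ rot.T
    out[:, 2] += dz
    return out

# Plotting is then just a scatter of the transformed points, e.g.:
#   ax = plt.figure().add_subplot(projection="3d")
#   ax.scatter(*apply_offsets(pts, yaw=0.02).T, s=1)
```

Re-running the transform and scatter on every widget change is what gives the instant visual feedback described above.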

Optimization and smart offset suggestions

GIF showing 3D interactive visualization of the point cloud inside the notebook.

We analyzed a variety of point clouds collected from the snapshot bags and devised performance metrics that indicate whether the calibration offsets are good enough to produce smooth movement. Using data from robots with ideal movement, we determined optimal values for these metrics. Once the optimizer has gone over all the point clouds in the ROSbag, the calibration tool uses this information to suggest optimal manual offsets for the robot’s camera, along with a visual representation of the adjustment.
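For illustration only (this is a simplified stand-in, not our production metric or optimizer): if floor points should sit near z = 0 after the transform, the spread of their heights is one plausible quality metric, and a naive grid search over a single offset might look like:

```python
import numpy as np

def floor_flatness(points: np.ndarray) -> float:
    """Spread of candidate-floor point heights (lower is better)."""
    floor = points[np.abs(points[:, 2]) < 0.05]  # points within 5 cm of z = 0
    return float(floor[:, 2].std()) if len(floor) else np.inf

def best_pitch(points: np.ndarray, candidates) -> float:
    """Pick the pitch offset whose corrected cloud scores best on the metric."""
    def corrected(pitch):
        c, s = np.cos(pitch), np.sin(pitch)
        rot = np.array([[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]])
        return points @ rot.T
    return min(candidates, key=lambda p: floor_flatness(corrected(p)))
```

The real tool evaluates several metrics across every point cloud in the bag, but the principle is the same: score each candidate offset against data from known-good robots and suggest the winner.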

Uploading calibration data back to the robot

After the calibration work is complete, the file is delivered to the robot when it connects to the network to receive its next instruction.

Abstraction of details in the tool

All of the UI elements were built with either the IPython widgets library or Google Colab’s form elements. No code is visible to the end user, and most of the Python functions live in a Python file that is downloaded from the cloud when the tool starts. Dropdowns, sliders and input boxes are used in the simplest way possible, so anyone can adjust the calibration of any robot in the fleet, provided a snapshot bag is available in the cloud.

Workflow

The workflow of the calibration tool

Other applications

Using a similar setup integrating Google BigQuery, Cloud Storage and Colab, we have developed data annotation and machine learning model training tools. For example, a ROSbag is parsed to collect all of the images inside it, which are then presented to the user for annotation. The user annotates the data using UI elements and finally exports the data and labels with the click of a button.

Since the robots regularly store analytics and ROSbags in the cloud, the tools developed in Colab have helped us with data analysis, model training and testing without manually searching for the data.

Additional improvements

Following the calibration optimization tool, we developed a tool that optimizes the calibration of all the robots on a given site and generates a PDF report with the suggested calibration offsets and plots of the point clouds transformed using those offsets. This has saved a lot of support time and prevented robot downtime.

Fully automated in-cloud optimization, which will require no manual intervention, is in progress. Various metrics will be monitored regularly and the values adjusted to ensure optimal performance.

Conclusion

At 6 River Systems, we use diverse thinking to apply solutions to multiple problems. We build on tools like Google Cloud Platform and open-source software, and we take the time to develop intelligent, impactful solutions. If this is exciting to you, check out our careers page: 6river.com/jobs


About the Author

Arpit Gupta is a robotics software engineer on the perception team, building safety and perception features to make the robot safer and more reliable.

Arpit has been with 6RS for one and a half years, since graduating with a Master’s degree in Robotics from Worcester Polytechnic Institute (WPI).

How to Build a VR Offsite
https://6river.com/how-to-build-a-vr-offsite/
Mon, 08 Feb 2021

The post How to Build a VR Offsite appeared first on 6 River Systems.

Your typical Offsite is a fairly standard affair. It serves as a way to build camaraderie and teamwork. For my team, Technical Operations, Offsites were a great opportunity to meet new members and share new ideas, not to mention take advantage of a hotel-catered breakfast buffet. 2020 changed all of that.

B.C. – Before Coronavirus

In February 2020, I was asked by Eran Frenkel, the manager of TechOps to plan our sixth annual Offsite for the team. I asked Alfred Larsson, who planned last year’s Offsite, for his thoughts and recommendations. He shared a lot of useful data about what people would like to see at the next event. I researched team-building activities and relevant topics. My Offsite, like so many before, I thought, would follow a simple agenda consisting of a leadership presentation, some team-building games, and a couple breakout/brainstorming sessions.

But then the world ended. The COVID-19 pandemic forced Shopify and 6 River Systems to adopt a work-from-home policy and restricted nonessential travel. Each Riv was given a box to take some desk stuff home and the rest was packed up and stored in what was meant to be our new office building. Fortunately, I was able to adapt my normal work to this new Digital-by-Default lifestyle, but we couldn’t very well fly everyone to a hotel conference room during a public health crisis. I figured I’d have to postpone the offsite and scrap all my plans. 

Instead, I set out to “Find a better way”, one of 6 River Systems’ core values. What if our Offsite was held in Virtual Reality? I had just purchased my own VR system and I loved the feeling of being so immersed in VR games. I brought the idea to Eran; Rivs could access the Offsite, hosted entirely in a VR environment, from the comfort of their homes. He was excited about the idea, but concerned about how we would pull it off.

Okay, but how?

For a couple weeks, most of my days were spent researching existing methodologies and VR event companies. Based on previous Offsites, I figured a budget of approximately $5K was reasonable. Boy, was I wrong. Most of the quotes I received were over $12K for the base event alone, not including hardware. So I decided to put more energy into something a little more scrappy and a little more thoughtful: developing our own VR meeting space. 

Now, I’m not a videogame developer. I spent most of school building real, physical robots and I was in no way prepared to create something from scratch. I needed a base program or SDK (software development kit) to start with. In VRChat, I found both!

VRChat is a game that rose to popularity with the advent of standard VR hardware in 2017. It is supported not only by Oculus and SteamVR, two prime VR platforms, but by standalone Windows PCs as well. This broad platform base made it an ideal starting point for Rivs stuck at home as well as Deployment Engineers who would be traveling at the time of the Offsite.

I found extensive documentation including tutorials, wikis, and example code for developing in VRChat. After finalizing presentation topics and themes, I set to work designing the actual VR world. One of the best things about developing in VR is that there are nearly endless possibilities for environments and activities. Participants could teleport anywhere, lift objects 100 times their size, even fly! However, seeing as this would be many Rivs’ first time with VR, I figured we should start simpler. I designed a familiar conference environment: a series of 4 rooms arranged around a central hub (pictured above). These rooms would serve as the spaces for three team-building activities, as well as an auditorium for a presentation from Eran and Rylan Hamilton, co-founder and co-CEO of 6RS (pictured below).

Eventually, this monolithic world would be broken up into several smaller, more manageable worlds. We were shipping Oculus Quests for participants to rent, and they had a maximum world size of about 40–50 MB. Quests were certainly a cheaper, more portable alternative to some of the $800 VR hardware out there. The smaller world size would also help me keep Rivs organized.

Immerse v1.0

The all-day event started with all users meeting in the Hub world, and then taking a world portal to the auditorium, where Rylan Hamilton and Eran Frenkel gave a leadership presentation. After a non-VR breakout session, we all rejoined the Hub for transport to the first of three team-building activities. Rinse and repeat. Knowing firsthand the importance of having a break between VR sessions that last an hour or longer, we never scheduled back-to-back VR activities. One participant did complain about motion sickness about halfway through the event, but was able to log back in on their PC for a less disorienting experience.

While, generally, the event was a success, this first Immerse Offsite was not without its issues. During the presentation, some users complained about slides not being synced across everyone’s systems. In the first team-building activity, some users noticed latency and low frame rates as they struggled to build brightly colored blocks into a bridge (pictured below). In the last activity, what was intended to be a virtual handwashing competition instead became an Oculus death sentence as nearly every user’s headset crashed due to some infinite-spawn bugs. 

Immerse II: The Immersening

Despite all these issues, the event was a big hit. Everyone wanted an Immerse II, and next time I couldn’t coast on its novelty alone. About seven months later, Eran asked me to begin planning another event, and this time, I was to select a partner who would focus on the logistics and theme of the Offsite. This would free me to fine-tune the technical aspects of the event. I also took triple the amount of prep time in order to test and debug the VR worlds.

As you might imagine, with ample prep time and my stellar partner, Jess Podoloff, the second Immerse was much more polished. I was able to devote nearly all my time to designing team-building activities and migrating slides to VR, while Jess commissioned breakout content and ordered headsets. She even shipped every participant a custom TechOps Immerse jacket! 

We asked three other coworkers to serve as “chaperones”. My own subteam, CI-Ops (Jamie Hall, Lucy Stuehrmann, and Julio Salazar), served as group monitors as we divided all participants among the three team-building activities. Each activity was hosted in a unique VR world; participants were split into three groups and swapped worlds each activity slot. By rotating slots in this way, we were able to cap the number of Rivs at 12 in each world at any given time. This drastically improved performance and helped offset some of the work needed to wrangle everyone. Strangely enough, when you give 40 people VR headsets, they spend a lot of time ooh-ing and aah-ing loudly while you try to get them organized.

But once we got them in the Team-Building competitions, they were laser-focused on winning. The new activities were wildly popular! First, there was a design exercise in which teams competed to construct a track to roll a ball as far as possible, with the fewest number of pieces (pictured below). The next was a picking relay that rapidly dissolved into chaos. And last, the most popular, was a race through a maze held under a giant Chuck!

All in all, the program has come a long way in the past year. The TechOps team asked that we make Immerse an annual event and I’m excited to brainstorm new ways of bringing my team together from the comfort of their own homes. I’m immensely grateful to work at a company that values team-building and cooperation as well as my desire to innovate. I’ve already started thinking about Immerse III and have gotten lots of inquiries from teammates willing to help out!


Want to join this innovative team while working from home? Check us out at 6river.com/jobs


About the Author

Mathew Schwartzman works on the CI-Ops team managing Customer Integration Testing and building custom tools for 6 River Systems.

Mathew has been with 6 River Systems for close to 2 years. When he’s not building VR Worlds he enjoys IOT projects, theatre/film production, and the outdoors!

How 6 River Systems is Solving the Challenges of Traditional Zoning Methods
https://6river.com/dynamic-zoning-blog/
Thu, 21 Jan 2021

The post How 6 River Systems is Solving the Challenges of Traditional Zoning Methods appeared first on 6 River Systems.

Dynamic Zones

Ask any pick-and-pack warehouse operator: there are pros and cons for each of the picking methodologies they choose for their fulfillment operations. Zone picking is ideal for very large operations where pickers must travel long distances to complete an order or for smaller sites with very high order volume where congestion is an issue. However, a common challenge associated with zone picking is predicting where your associates will need to be based on inventory slotting and the order pool for the day. Incorrectly predicting zones results in overworking some associates while others wait around with nothing to do.

Armed with knowledge of this challenge, we tackled the question: what if you could reap the benefits of zone picking without the challenge of analyzing inventory slotting, consumer demand and labor availability? Our solution: dynamic zoning.

First, let’s look at the methodology behind using static zones within a warehouse.

Why are zones used?

In order to meet SLAs and maintain profitability, there is a constant pressure to keep a fulfillment operation running efficiently. Enabling associates to spend more time picking products and less time walking from one place to another is an area ripe for optimization. Although collaborative mobile robots eliminate long, unnecessary walks to deliver and receive work, they do not remove walking between picks. One technique to alleviate unnecessary walking is to break the picking area into zones and assign associates to stay within them. This can be especially effective if there are sparse picks over a large area.

What are the challenges of using zones?

There are two interrelated problems that often cause zones to be less effective than operators hope: defining zone areas and labor balancing.

Zone areas are set with the intent to evenly distribute work and ideally put common clusters of work in one area. The problem is that it is extraordinarily difficult to predict commonly clustered SKUs. Order profiles are constantly changing over time – from season to season and even over the course of the day. We often see warehouses that have been broken into zones based on physical size and then never adjusted again.

The second, related problem is labor balancing. Over the course of a shift, an operator wants associates to spend as much time as possible actively picking rather than traveling from one pick to the next or, even worse, sitting idle. So, when utilizing a zone picking method, it’s best when orders are evenly distributed across all zones so each associate stays engaged. Unfortunately, this rarely happens – there are often drastically different workloads across zones. To mitigate this, managers need to constantly monitor order volumes and adjust zone assignments on the fly. Although some of our operations still want to use zones, most decided that the inefficiencies and management overhead outweighed the benefits.

The Solution: Dynamic Zones

Analyzing our customers’ needs, we have devised an innovative solution that reaps all of the benefits of traditional zones with none of the drawbacks. We call this dynamic zoning; here’s how it works:

An illustration of how Chuck supports micro-zones

When an associate completes a pick and is preparing to travel to the next, 6 River Systems’ intelligent allocation system calculates how much time it will take for the associate to get to the new pick. It also calculates how long it will take the current associate to meet another robot and a new associate to meet the current robot. If it ever takes less time to do this handoff, it is performed and time is saved. It has exactly the same benefit as if a perfect zone was created for this batch of orders.

Figuring out ideal zone boundaries is no longer a problem; there are no fixed zone boundaries! The 6RS software effectively determines the perfect zone for each robot without any operator intervention.

There is no need for labor balancing – we do that automatically. With dynamic zones, associates never wait for work. As soon as an associate completes work with a robot, it leaves for its new “zone” and the associate can meet the next waiting robot – the system has already determined that one is available for them.
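The handoff decision described above boils down to a time comparison. A toy sketch (travel times here are plain parameters; in the real system they come from the allocation engine):

```python
def should_hand_off(current_to_next_pick: float,
                    current_to_other_robot: float,
                    new_assoc_to_this_robot: float) -> bool:
    """Hand the robot off when both re-pairings finish sooner than the
    current associate could simply walk to the next pick."""
    handoff_time = max(current_to_other_robot, new_assoc_to_this_robot)
    return handoff_time < current_to_next_pick

# A 90 s walk to the next pick, but a nearby robot and associate 20-30 s away:
# the handoff wins, and an implicit "zone" boundary is drawn on the fly.
```

Because this comparison runs at every pick, the zone boundaries effectively redraw themselves with each batch of orders.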

Summary

Traditional static zones can be an effective technique to increase productivity and it is a methodology that 6 River Systems has supported for years. However, it does have its shortcomings so we designed a superior solution: dynamic zoning, which is just one of many innovations that we offer to our customers.

What are the advantages of system-directed robots vs swarming robots?
https://6river.com/what-are-the-advantages-of-directed-picking-robots-vs-swarming-robots/
Wed, 13 Jan 2021

The post What are the advantages of system-directed robots vs swarming robots? appeared first on 6 River Systems.


A question that often arises when teams research warehouse robots is understanding the differences between system-directed and swarm robots. Both of these technologies are types of collaborative mobile robots, ones that work closely alongside your associates. These types of systems attempt to increase productivity rates compared to manual cart picking by reducing unnecessary walking and manual labor, speeding up workflows and reducing training time. Both approaches can provide benefits over cart-based picking, but there are significant differences as well.

Before comparing and contrasting methodologies, let’s define each robotic approach.

System-directed robots

At its simplest, an associate meets a robot and is directed from pick to pick. This can be broken into traditional zones or utilize a more advanced technology like dynamic zoning, but the associates are generally performing a number of sequential picks with the same robot. This technology is typically used for order picks, batch picking, cycle counting and replenishment tasks. For ease of reference, we will refer to this method as “directed picking” throughout this piece.

Swarming robots

Swarming robots generally go to a specific pick location and wait for an associate to perform work. Once the work is completed, the swarm robot travels to the next pick location and the associate finds another swarm robot that has available work. This technology is typically used for order picks, cycle counting and replenishment tasks. For ease of reference, we will refer to this method as “swarming.”

When is it best to use directed picking over swarming?

The two robotic approaches exist because there are different circumstances in which one is more effective than the other. Swarming robots can be effective in some operations, particularly in small, dense picking operations or ones with wide-aisle pallet picking. However, as an operation becomes more sophisticated, the advantages of directed picking robots multiply. These include better scaling to large operations, higher worker productivity and support for higher-density warehouses.

Operation Scaling

In any operation, there is generally a direct relationship between the square footage of a warehouse and the number of pickers on the floor. So, as warehouses increase in size, you will typically see more pickers. However, the relationship of space to pickers and of space to robots is not the same between directed picking and swarming designs.

With directed picking robots, the quantity of robots is determined by the number of pickers at your operation. Directed picking deployments typically use 1.5 robots for each picking associate, which ensures that as robots autonomously travel to and from work areas within a warehouse, each picker on the floor is still engaged with a different robot. So, for example, a 50k square foot warehouse with 8 pickers might require 12 robots, while a 500k square foot warehouse employing 24 pickers might need 36 robots. The robots lead associates through an optimized, directed workflow, and the system manages dynamic zones in real time. This guarantees that associates are always directed to complete their work in the most efficient way possible.
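That sizing rule is simple arithmetic; a one-line helper under the assumed 1.5:1 ratio reproduces the examples above:

```python
import math

def robots_needed(pickers: int, robots_per_picker: float = 1.5) -> int:
    """Fleet size under the robots-per-picker rule of thumb, rounded up."""
    return math.ceil(pickers * robots_per_picker)

# 8 pickers -> 12 robots; 24 pickers -> 36 robots.
```

The key point is that the fleet scales with headcount, not with square footage.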

However, site designs using swarming robots recommend quantities of robots based on the square footage of a site rather than the number of pickers. Why is this?

In order to be efficient, swarming robots need a high density of robots in the picking area. Otherwise, the walking time between robots increases and efficiency rapidly drops. As the warehouse expands, this problem intensifies. For example, a 500k square foot warehouse would need 10x more swarming robots than a 50k square foot warehouse to maintain the same worker efficiency.

For larger operations, the ROI usually cannot justify the cost of such large fleets of swarming robots. In response, swarming robotics companies recommend operational changes. One method is to allocate jobs where picks are all within the same area and focus associates there. However, it is highly unlikely that all of the picks within an order contain SKUs located within the same few aisles. So, assuming they are not singles (which are often better picked in batches), a wave of picks that starts as a densely packed swarm of robots will eventually disperse throughout the warehouse, resulting in longer times between picks and a corresponding drop in efficiency. There are a number of ways that swarming robot providers try to address this, such as waving work across the warehouse and zone picking, but they don’t fundamentally address the density problem.

Another method is to sweep across the warehouse (pick everything from left to right or in some other pattern). This can reduce robot density requirements a little, but it requires all of your associates to repeatedly walk the full length of the warehouse, causing both lost time and mistakes attributable to fatigue. To fix this problem you can organize associates into zones. However, this just leads us back to the original problem: either there are enough robots to maintain pick face density (with the resulting capital cost of that many robots) or the warehouse is split into zones plagued by unpredictable workloads – some zones will be very busy while others are very light – which leads to overworked and/or idle associates. Ultimately, it is very difficult to effectively deploy a very large number of swarming robots within a large operation.

Worker Productivity

One of the biggest advantages of a directed workflow is that you can track and directly influence the productivity of your associates. After removing “the long walk,” the next largest source of lost productivity is time between picks. With a directed workflow, this time between picks can be closely monitored and mitigated.

Average Task Time by Day - Directed vs Manual Picking

In the graph above, the blue line represents associates walking with a robot from the place they met the robot to their first pick, and the yellow line represents associates walking from the last pick to meet a new robot, which they do without a robot. As the chart shows, when associates are left to their own devices, they tend to be both less predictable and much slower. In directed workflows, almost all of the associates’ movements are guided by the robot, which keeps them on track.

By comparison, with a swarming approach, associates are never directed between tasks. The robot can indicate on its screen where the associate can find their next task, but much like what is illustrated by the yellow line in the graph above, associates are left to pace themselves to that location. Because associates must pace their own work, swarming systems fail to keep them engaged with the pick faces.

Warehouse Density and Aisle Size

In a warehouse, you are generally trying to maximize the use of floor space. Directed picking robots allow for significantly narrower aisles by utilizing one-way aisles when necessary. This is not surprising, as it is the way that most cart pick operations work to leverage floor density. Essentially, an aisle only needs to be slightly wider than the directed picking robot. For 6 River Systems’ Chuck, the minimum aisle width needed is 42 inches, which fits within most typical aisle layouts. Swarming robot vendors like to point out that Chucks need wider aisles to pass each other, which is true, but this is not a real-world problem. They point this out to mask a significant shortcoming of swarming robots – large minimum aisle sizes.

Swarming robots, by their very nature, must be able to pass each other all the time or they get stuck. The swarming robot goes to a pick location to wait there for interaction with an associate. After the interaction is complete, it autonomously travels to its next task. If two robots end up in an aisle where they cannot pass each other, the system becomes deadlocked. The associate will complete a task on one robot, which will then seek to move in either direction down the aisle. However, its motion in one direction would be blocked by the next robot and in the other direction, the picker. This means that the minimum aisle size has to be much wider to ensure the robots can pass each other. Swarming robot vendors quote 60” as a minimum size, which is significantly wider than most high density picking aisles.

Carrying Capacity

“Can we put less on that robot?” asked no warehouse operator, ever. Although the size of a robot is not inherently linked to directed vs swarming methodologies, in practice it is much harder to make a large swarming robot since they must be able to pass each other in picking aisles. Due to this fact, swarming robots in the industry tend to be small and provide less carrying capacity. Directed picking robots tend to have a much higher carrying capacity. As a result, picking paths and work is much more efficient. 

For example, if a SKU is needed for 6 orders, the directed system will put all 6 of those orders onto the same robot and only visit its location once to retrieve all 6 items. Meanwhile these 6 orders in a swarming system will be allocated to 6 different robots and a picker will visit that same location 6 different times to pick them – increasing congestion on the floor and adding walking time for the associate.
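To make the arithmetic concrete, here is a quick sketch (hypothetical code, not anything from the 6RS stack) of how batching changes the number of location visits:

```typescript
// Hypothetical sketch of how batching changes location visits.
interface PickLine { orderId: number; sku: string; }

// Directed: all orders needing a SKU share one robot, so each distinct
// SKU location is visited exactly once.
function directedVisits(lines: PickLine[]): number {
  const seen: { [sku: string]: boolean } = {};
  for (const line of lines) seen[line.sku] = true;
  return Object.keys(seen).length;
}

// Swarming: each order rides its own robot, so every order line is a
// separate trip to the SKU's location.
function swarmingVisits(lines: PickLine[]): number {
  return lines.length;
}

// Six orders that all need the same SKU:
const lines: PickLine[] = [1, 2, 3, 4, 5, 6].map((orderId) => ({ orderId, sku: "SKU-123" }));
console.log(directedVisits(lines)); // 1 visit
console.log(swarmingVisits(lines)); // 6 visits
```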

In Conclusion

As volumes spike and labor gets harder to find, collaborative robots are an important tool for modern fulfillment centers. There are use cases for both directed picking and swarming robots. For small sites that are pallet picking with wide aisles, swarming robots can be very effective. As sites get larger and pick-face density goes up, there are significant advantages to using directed picking robots. If you want more information on how 6 River Systems’ collaborative robots can help your operation, read our latest white paper outlining the business case for collaborative mobile robots.

The post What are the advantages of system-directed robots vs swarming robots? appeared first on 6 River Systems.

The 5 Ws of Working with Google Cloud Functions https://6river.com/the-5-ws-of-working-with-google-cloud-functions/ Mon, 23 Nov 2020 22:40:39 +0000 https://6river.com/?p=6555

Continuing our new Engineering blog series, perception engineer Kayla Comalli will illuminate Google Cloud Functions (GCFs).

A fairly recent addition to the Google Cloud toolbox, Google Cloud Functions (GCFs) deliver rapid value to developer applications. It’s always exciting to see the powerful applications that arise from new tools, so this seemed like a good opportunity to discuss the tech behind them. We’ll break down GCFs from five high-level angles and explore the applications 6 River Systems (6RS) has built with them.

Now let’s dig into the who, what, when, where, and why of 6RS’s choice to implement GCFs.

What are Google Cloud Functions?

Oftentimes, when one imagines a cloud computing architecture, there’s an association with some monolithic, über-complicated behemoth. From Kubernetes clusters to big data storage to networking, it’s not always a simple matter of grokking these interconnections. A comprehensive grasp of disparate concepts and their jargon can feel like a labyrinthine, yet necessary, step in building a bridge between your local application and cloud computing.

Yet despair not! There exists a direct route to networking your apps, as the crow flies, sans barrier.

In 2017, Google Cloud released a beta feature that promised a remedy to the logistical pain points of server builds. These “functions as a service” (FaaS) are single-purpose pieces of code that run in response to events. Attach one to a trigger such as an HTTP request, a Pub/Sub message, or a Cloud Storage change, and the custom function that you build executes automatically. A function for data manipulation, intermediate logging, API bridging: anything! Node.js, Python, Java and Go are all supported runtimes, and your code can be supplied inline, as a zipped package, or from a cloud source repository.
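As a minimal illustration of the programming model (all names here are ours, and the structural types merely stand in for the Express request/response objects the Node runtime actually passes), an HTTP-triggered function is just an exported handler:

```typescript
// Structural stand-ins for the Express req/res the Node runtime provides.
interface Req { query: { [key: string]: string | undefined }; }
interface Res { send(body: string): void; }

// Deployment points a trigger at this exported name; every matching
// HTTP request then runs the function automatically.
export function helloChuck(req: Req, res: Res): void {
  const name = req.query.name ?? "world";
  res.send(`Hello, ${name}!`);
}
```

A Cloud Storage or Pub/Sub trigger works the same way, except the handler receives an event payload rather than a request/response pair.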

Google Cloud Functions Respond to Events

GCFs burst onto the scene in a fanfaronade of straightforward, rapid project-to-cloud communication. Rah rah!

At 6RS, we use GCFs to build a fast and effective process to handle live data on our autonomous mobile robots, which we call Chucks. Because of the built-in trigger functionality, GCFs are a perfect candidate for real-time data streaming. As Chuck leads associates through their tasks, analytical details and telemetry stats are constantly streaming in. GCFs give us a way to transform and store that information as queryable data, efficiently in both computational cost and execution time.

Why Cloud Functions?

The benefits of adopting newer tech over existing services aren’t always obvious. Why not simply maintain the existing and functioning architectures? Well, alongside the adoption of ever-evolving cloud-based options come grand shifts in company efficiency and scalability. Cutting-edge releases address performance and feature constraints that accompany all legacy applications. Engineering teams can greatly benefit from keeping a regular lookout for new answers to old questions. Enter: cloud functions.

GCFs offer a cheaper way to compute asynchronous tasks while making task parallelization significantly easier. Monolithic servers certainly have significant benefits, but for tasks such as managing multitudinous data streams, GCF provides a clearcut, atomized avenue. Within an isolated, event-processing environment separate from the full server pipeline, devs can surgically pin down the computational demand to precisely what is needed.

Since the beta three years ago, GCFs have continuously delivered, proving to be a robust, time-saving tool. GCFs operate in a serverless framework, which translates to zero server-management overhead. That savings has allowed more logistics features to reach the customer in a traceable way. Not only this, but the greater modularity that comes packed into these applications opens doors for rapidly testing out proofs of concept; a boon to innovative development.

With such a simplification of process, it’s important to address how security is managed. Security measures for cloud functions can be configured with identity-based or network-based access controls. Identity and access management configuration, environment variables, ingress traffic options, etc. keep the system as airtight as a traditional server setup. Additionally, the atomization of the functions allows an even more granular approach to permission regulation.

Who is the target group?

Seamless integration is the name of the game. 6RS strives to deliver new tools that bring the gift of continuous automation to the dev workflow. The cloud functions’ trigger-based operability reduces the work of weaving applications together to a single, high-level component.

With minimal deployment logistics, autonomy and data engineers can easily use GCF automation to analyze snapshots of events at any point in time on the robot. The payoff of such a feature is threefold:

  1. instantaneous visibility for error handling, either through alerts or dashboards
  2. recreating scenes for dynamic exploration of interrelated components
  3. persistent storage of historical events for direct improvement testing

Where does it apply?

At our customer sites, GCFs are working full time in the background to enable data management with zero downtime. This rapid framework of analytics enables our system to make informed and predictive adjustments. We are constantly developing ways that systems can use real-time AI tools for increased safety, navigation, and interfacing.

Internally, autonomy and analytics teams gain value from GCFs by making telemetry data more accessible, which provides insight on streaming measurements in areas such as calibration, charging, odometry, and mapping. Timing is crucial for minimizing computational costs, and with triggers at the end of the pipelines, these measurements are stored and analyzed at an optimal point.

With an emphasis on ease of integration, further FaaS applications are a fast lane for rapid concept testing, opening the gates for data-based features in the pipeline. Data collection is crucial for training and testing machine learning applications, like computer vision, preventative analysis, and unexpected event handling.

When can we expect more?

6RS is perpetually exploring the next stages of what we can deliver; our engineering teams have been utilizing functions for heavy-hitting services since mid-2018. As mentioned, FaaS has been the silent powerhouse behind much of our analytics processing and storage for over two years.

The massive ROI on these pursuits has been apparent, and further expansion is currently being productized as a result. Engineers are building ways to advance post-processing analytics for predictive debugging, machine learning tools and holistic system visibility. Stay tuned!


About the Author

Kayla Comalli works on the Perception team doing computational vision and sensor telemetry.

Originally pivoting careers from biology to coding, she’s currently working on her comp sci degree at Tufts for fun. She’s been an engineer for 5+ years, 3 of which have been at 6RS.

In her ‘free’ time Kayla is doing classwork, programming VR games, reading, or running.

The post The 5 Ws of Working with Google Cloud Functions appeared first on 6 River Systems.

Data Driven Robotics: Leveraging Google Cloud Platform and Big Data to Improve Robot Behaviors https://6river.com/data-driven-robotics-leveraging-google-cloud-platform-and-big-data-to-improve-robot-behaviors/ Wed, 28 Oct 2020 17:03:12 +0000 https://6river.com/?p=6193

In order to digest the multitude of data transmitted from robots at each of our customer sites, engineering teams at 6 River Systems developed a data pipeline to ship data from the Robot Operating System (ROS) to Google BigQuery. What does this infrastructure look like, and how did we develop it? This blog post provides an overview of the design and then discusses our open-sourced implementation.

First, why did we need to develop this pipeline?

Aren’t there existing technologies for shipping data to a cloud server? Let’s talk about some of our goals for this project.

We needed a system that:

  • Can collect information from all of our robots at all of our sites and store it in one location.
    While there is value in looking at the information from a single robot or single site, we also look for trends across our entire fleet. Using this pathway, we can compare different robots on different software versions at different sites.

  • Can operate with limited network connectivity.
    Our AMRs, which we call Chucks, do not require any additional infrastructure at our customer sites. This allows us to deploy in a fraction of the time of other solutions. However, it comes with some constraints, one of which is that we do not always have perfect network connectivity across the entirety of every warehouse. Our solution needs to be able to buffer data if there is a gap in network coverage.

  • Does not lose any data.
    Analysis is limited by the quality and completeness of the data available. Regardless of what happens to a robot, whether it powers down or loses connectivity, we want to make sure that information from it makes it to the cloud.

  • Is easy to expand / add information to in the future.
    We are in the business of continuous improvement. As we create new features, we must constantly expand the scope of analytics information that we are collecting. It should be simple for any developer to add additional data.

  • Leverages technologies that we are already using in other parts of our stack.
    There is no need to reinvent the wheel if we already have experience with a technology.

Given those goals, we developed the following pipeline:

Let’s talk through this block diagram piece by piece.

Local Storage

First, on the bottom of the stack, we have processes running on each individual robot. Chucks run using a collection of tools and libraries called ROS (Robot Operating System). ROS is used across a wide array of robotic tech stacks, both in academia and industry, and it provides a lot of tools. Most important to this discussion, ROS has an interprocess communication system called ROS topics, which sends messages between different nodes. These messages can be recorded to disk in a file called a bag, or rosbag. Rosbags are useful because they can be played back offline to view data using the ROS visualization tools.

Chucks continuously record two types of rosbags: black box and analytics. The black box bags are like the data recorder on an airplane: they contain high-bandwidth sensor data that allows us to recreate what a robot was seeing and doing. These bags are stored on disk and optionally uploaded to the cloud. Analytics bags feed our data pipeline. Any ROS topic can be included in our analytics bags, though we keep the bag size in check so that it does not impact customer network bandwidth; as such, these bags do not contain things such as raw sensor data. A new analytics bag is saved every 2 minutes and then uploaded to the cloud.

Google Cloud Storage

Now we move up the diagram to the cloud portion of the stack. We leverage a number of different technologies that are part of the Google Cloud offerings. Google provides a suite of tools that makes it easy to use their various cloud services.

For our Chuck analytics pipeline, we use a combination of three of their products. First, we use Google Cloud Storage (GCS) as the staging point for our data. The bags are uploaded into a GCS bucket. The storage bucket allows us to keep our data in bag form for easy download to use with other ROS tools. It also provides a number of guarantees around uptime and durability so we know that none of our data will get lost. On the robot, we use Google’s storage SDK and custom code to ensure that each bag gets uploaded into the storage bucket. Our system deletes bags from the robot once they have made it to the cloud and periodically checks to see if any old bags still need to be uploaded. In this way, we can ensure that all of our analytics data makes it into the cloud even if the robot is traveling through an area of poor network connectivity.
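The “delete only after a confirmed upload” bookkeeping can be sketched as follows. This is illustrative, not the actual 6RS implementation: the upload step (an asynchronous Storage SDK call in production) is modeled as a plain success/failure function so the loop is self-contained.

```typescript
// The upload step is injected; in production it would wrap a call such
// as bucket.upload(path) from Google's storage SDK.
type UploadFn = (bagPath: string) => boolean;

// Try to push every pending bag. A bag is only considered done, and
// therefore safe to delete from the robot's disk, once its upload
// succeeds; anything that fails stays pending for the next sweep.
function flushBags(pending: string[], upload: UploadFn): string[] {
  const stillPending: string[] = [];
  for (const bag of pending) {
    if (upload(bag)) {
      // fs.unlink(bag) would run here on the robot: delete only after
      // the cloud has confirmed receipt of the bag.
    } else {
      stillPending.push(bag); // retry when connectivity returns
    }
  }
  return stillPending;
}
```

A periodic sweep over `stillPending` is what lets the robot drive through dead zones without losing analytics data.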

Once a bag is uploaded, it triggers the invocation of a Google Cloud Function (GCF). A GCF is a serverless process running in the cloud: a single function that gets called for each bag. Google provides automatic scaling, so no matter how many robots are running, we always have just enough GCF instances to handle all the bags being uploaded without any sitting idle. Our GCF opens the bag, converts the data, and inserts it into its final storage place in our pipeline.
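The entry point of such a function receives a small event payload describing the uploaded object. One concrete detail worth sketching is topic-to-table naming, since a ROS topic name contains slashes that a BigQuery table name cannot; the convention and names below are our illustration, not necessarily what the 6RS repository does.

```typescript
// The object-finalize event a storage-triggered function receives,
// abridged to the fields used here.
interface GcsEvent { bucket: string; name: string; }

// A ROS topic like "/chuck/velocity" cannot be a BigQuery table name
// directly, so the leading slash is dropped and the rest are replaced.
function topicToTable(topic: string): string {
  return topic.replace(/^\//, "").replace(/\//g, "_");
}

// Skeleton of the per-bag function body.
export function handleBag(event: GcsEvent): void {
  console.log(`processing gs://${event.bucket}/${event.name}`);
  // 1. download the bag from Cloud Storage
  // 2. parse the rosbag and group messages by topic
  // 3. insert each topic's rows into the table named topicToTable(topic)
}
```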

The Data Warehouse and Analysis

The final two levels of our diagram are on the 6RS side. The data from the bag is inserted into BigQuery. BigQuery is a data warehouse product from Google designed specifically for analytics applications. Data is stored in tables and can be accessed using standard SQL queries or with a variety of products that integrate with it. At 6RS, we use direct SQL queries for quick analysis, Tableau for data visualization, and Google Colab for deeper data analysis. For example, we store robot velocity data at a regular cadence. This allows us to answer all kinds of questions: What is the average speed across the entire fleet? Are there individual robots that are unreasonably slow? At a single customer site, are there aisles which are much slower than others? These insights can be used to monitor our fleet and proactively help our customers. Here is an anonymized example of average robot speed per aisle:

A visualization of average Chuck velocities per aisle.

As you can see, most aisles have a similar average speed, but there are a few aisles where the average speed is significantly lower. Note that this chart does not include travel speed between aisles, which is typically significantly faster. We can use this information to help site management understand why Chucks are slower in those aisles and take steps to mitigate the problem.
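That per-aisle question is a single GROUP BY in SQL, e.g. `SELECT aisle, AVG(speed) FROM velocities GROUP BY aisle` (table and column names invented for illustration). The same aggregation, sketched locally:

```typescript
// One row of velocity data, keyed by the aisle it was recorded in.
interface SpeedRow { aisle: string; speed: number; }

// Equivalent of: SELECT aisle, AVG(speed) FROM velocities GROUP BY aisle
function avgSpeedPerAisle(rows: SpeedRow[]): { [aisle: string]: number } {
  const totals: { [aisle: string]: { sum: number; n: number } } = {};
  for (const { aisle, speed } of rows) {
    if (!totals[aisle]) totals[aisle] = { sum: 0, n: 0 };
    totals[aisle].sum += speed;
    totals[aisle].n += 1;
  }
  const averages: { [aisle: string]: number } = {};
  for (const aisle in totals) {
    averages[aisle] = totals[aisle].sum / totals[aisle].n;
  }
  return averages;
}
```

In production the warehouse does this work; pulling raw rows client-side only makes sense for small ad hoc checks.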

But, in this blog post, we are focusing on the pipeline, not the outputs. Check back next month for the next post in this series in which we will discuss in more detail how we can extract useful insights from the data collected in this pipeline.

Now, for the engineers who want a bit more technical insight, let’s look at some of the design decisions that we made when developing this pipeline.

Typescript

First, we chose to write our function in Typescript, for a few reasons:

  • At 6RS, most of our code is developed in either Typescript or C++, so we wanted to stick with something familiar to most engineers at the company.

  • Google provides a Node.js runtime for their cloud functions and a Node.js BigQuery API.

  • Cruise Automation had developed a rosbag parsing library in JavaScript that we could leverage. Our forked version can be found here.

Serverless

Next, we chose Serverless as the framework for deployment. Serverless provides an easy-to-use framework to deploy into different cloud function infrastructures, including Google, AWS, and others. This project was our first experiment using Serverless at 6RS, and we liked how it simplified the deployment process.

Cloud function

Finally, we had to make a number of design decisions for how we mapped our rosbags of data into BigQuery tables. Our main focus was on making the GCF essentially “set it and forget it”: we did not want to have to update our parsing function every time we wanted to add new information to our datasets.

We also focused on ease of use for developers. It should be simple to add more data to the dataset and easy to know what it will look like once it is in the database. To do this, we designed our pipeline such that every topic in the bag maps to a table in BigQuery with the same name. If a new topic is added to the bag, the GCF automatically makes a new table. The schema for the table is generated from the structure of the message such that every field in the message becomes a column in the table. Each individual message is a separate row. If we update the message to include new fields in the future, the schema is automatically updated to be the union of the two messages. That way, the function can operate on any version of the data as the new version is rolled across our fleet of robots.
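That schema logic can be sketched as follows, under simplified assumptions: the real GCF derives column types from the ROS message definition, whereas this sketch infers them from JavaScript runtime types.

```typescript
// A table schema: field name -> BigQuery column type (simplified).
interface Schema { [field: string]: string; }

// Every field in a message becomes a column; the type mapping here is
// an illustrative stand-in for the real ROS-to-BigQuery mapping.
function schemaFromMessage(msg: { [field: string]: unknown }): Schema {
  const schema: Schema = {};
  for (const field in msg) {
    const v = msg[field];
    schema[field] =
      typeof v === "number" ? "FLOAT" :
      typeof v === "boolean" ? "BOOLEAN" : "STRING";
  }
  return schema;
}

// When a message version gains new fields, the table schema becomes the
// union of old and new, so both versions can coexist across the fleet.
function unionSchemas(oldSchema: Schema, newSchema: Schema): Schema {
  const merged: Schema = {};
  for (const f in oldSchema) merged[f] = oldSchema[f];
  for (const f in newSchema) merged[f] = newSchema[f];
  return merged;
}
```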

To make data analysis easier, we append common information to every row of the table, including the robot’s pose, the timestamp at which the message was recorded, the robot ID, and the build number. This way, this data does not need to be included in every single ROS message being stored in the analytics bag.
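The append itself is just a merge of message fields with fleet-wide context (the field names below are illustrative, not the actual column names):

```typescript
// Context appended to every stored row; in production this covers the
// robot's pose, the message timestamp, the robot ID, and the build.
interface CommonFields {
  robotId: string;
  build: string;
  recordedAt: string;
  poseX: number;
  poseY: number;
}

// The stored row is the message's own fields plus the common context,
// so individual ROS messages never need to carry it themselves.
// Common fields win on any name collision.
function toRow(
  msg: { [field: string]: unknown },
  common: CommonFields
): { [field: string]: unknown } {
  return { ...msg, ...common };
}
```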

Our Google Cloud Function is available at github.com/6RiverSystems/RosbagToBigQuery. I encourage you to open the repository (at tag v1.0) to see how these design decisions play out in actual code.

The code in this repository is a slimmed down version of our GCF, but it should be deployable and work to move data from bags into BigQuery. If you encounter any problems, please open an issue or a PR in the repository.

Check back later for the next blog post in this series where we discuss how we can query this data and use insights from it to drive our product decisions.

We’re building a wicked awesome team.

Check our current job postings to see if there’s a role for you.


About the Author

Daniel Grieneisen is a Principal Software Engineer in charge of robotic movement and behavior at 6 River Systems.

Dan has worked on robotic movement for his whole career. He has helped develop a variety of robotic platforms, including an autonomous off-road vehicle, a hospital delivery robot, an autonomous tugger for distribution warehouses, a number of different micro aerial vehicles, and now Chuck.

Dan has a B.S.E. from Olin College of Engineering and an M.Sc. in Robotics, Systems and Controls from ETH Zürich.

The post Data Driven Robotics: Leveraging Google Cloud Platform and Big Data to Improve Robot Behaviors appeared first on 6 River Systems.
