Palo Alto, California-based Augmented Pixels, a computer vision research and development company, calls the technology SLAM, or simultaneous localization and mapping. It is targeting SLAM at robots, drones, AR, and VR.
The module can also be used for inside-out tracking in augmented reality and virtual reality headsets, meaning the headset determines its own position using an onboard camera rather than external sensors.
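Inside-out tracking boils down to watching how image features move between camera frames and inferring the headset's motion from that. A toy sketch of the idea (the feature coordinates and the pure-translation assumption are illustrative, not Augmented Pixels' proprietary method):

```python
import numpy as np

def estimate_shift(features_prev, features_curr):
    """Estimate the average 2-D image shift between two frames
    from matched feature points (a stand-in for camera motion)."""
    return np.mean(np.asarray(features_curr) - np.asarray(features_prev), axis=0)

# Hypothetical matched feature points in two consecutive frames:
prev = [(100, 120), (240, 80), (310, 200)]
curr = [(105, 118), (245, 78), (315, 198)]
shift = estimate_shift(prev, curr)  # features drifted 5 px right, 2 px up
```

A real SLAM pipeline extends this to full 6-degree-of-freedom pose estimation while simultaneously building a map of the tracked points.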
“Augmented Pixels currently has the fastest proprietary SLAM for mono and stereo cameras, as well as sensor fusion and technologies for autonomous navigation (obstacle avoidance, point cloud semantics, etc.) on the market,” said Vitaliy Goncharuk, CEO of Augmented…
This is an exciting time for those of us in computer vision — we’re seeing it merge with AI to enable all kinds of new possibilities. At the LDV Vision Summit in New York a few weeks ago, I came away with five key insights about where computer vision will impact AI:
1. Smart assistants will battle it out over vision
AI needs data to learn from, and as we move closer to more “human”-like AI, it will increasingly need visual data. “This is one of the reasons all the major companies are at war to own the visual data of our activities,” said LDV Capital’s Evan Nisselson. “To do that, they need to own the camera.” Amazon recently added a camera to its Alexa-powered Echo, for example, and Google (Lens) and Facebook recently made augmented reality announcements of their own.
2. Optics alone could be enough to direct self-driving cars
We are seeing debate over whether self-driving cars need LiDAR or can depend solely on optical solutions. Tesla CEO Elon Musk, for example, doesn’t think that LiDAR, a bulky and expensive device that uses lasers to map its environment in real time, is necessary for fully autonomous driving. Humatics CTO Gregory Charvat, by contrast, said at the event that cars “need more than just optical sensor platforms [cameras], they also need LiDAR, radar, and high-precision radio navigation more precise than differential GPS.”
LiDAR and radar work by pinpointing actual objects in the surrounding environment by range and angle, whereas deep learning-based camera solutions must infer objects from pixels, so their outputs are ultimately still predictions. Optical solutions are nevertheless better at identifying what something actually is — for example, a pedestrian versus a bunch of pixels that look like a Christmas tree, as AutoX founder and CEO Jianxiong Xiao showed during a demo of his company’s impressive and low-cost self-driving solution that uses only cameras.
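The range-and-angle return that LiDAR and radar report converts directly into a position in the sensor's coordinate frame, which is why no prediction step is needed. A minimal 2-D sketch (the range and bearing values are made up for illustration):

```python
import math

def lidar_to_xy(range_m, angle_deg):
    """Convert a LiDAR return (range in meters, bearing in degrees)
    into Cartesian x/y coordinates in the sensor frame."""
    theta = math.radians(angle_deg)
    return (range_m * math.cos(theta), range_m * math.sin(theta))

# An object detected 10 m away, 30 degrees off the sensor's axis:
x, y = lidar_to_xy(10.0, 30.0)
```

Camera-based systems, by contrast, must first classify pixels as an object before any geometry can be recovered, typically by triangulating across frames or stereo pairs.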
Technology pros and cons aside, car companies typically work five years in advance, so the necessary hardware would need to be purchased now to make a 2021 deadline. For now, LiDAR and more advanced forms of radar are still expensive ($80,000 is considered cheap for the former) and bulky. Meanwhile, operating all these optical and sensor technologies in a fused way needs supercomputers small enough…
Google CEO Sundar Pichai today announced that the company’s speech recognition technology has now achieved a 4.9 percent word error rate. Put another way, Google transcribes roughly one word in 20 incorrectly. That’s a big improvement from the 23 percent the company saw in 2013 and the 8 percent it shared two years ago at I/O 2015.
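Word error rate is the word-level edit distance between a transcript and a reference, divided by the reference length; 4.9 percent works out to roughly one error every 20 words. A minimal sketch of the metric (the sentences are invented examples):

```python
def word_error_rate(reference, hypothesis):
    """Word error rate: word-level edit distance (substitutions,
    insertions, deletions) divided by the reference word count."""
    ref, hyp = reference.split(), hypothesis.split()
    # Standard dynamic-programming edit distance over words
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,       # deletion
                          d[i][j - 1] + 1,       # insertion
                          d[i - 1][j - 1] + cost)  # substitution
    return d[len(ref)][len(hyp)] / len(ref)

# One substitution in a four-word reference -> 25 percent WER:
wer = word_error_rate("the quick brown fox", "the quick brown box")
```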
The tidbit was revealed at Google’s I/O 2017 developer conference, where a big emphasis is artificial intelligence. Deep learning, a type of AI, is used to achieve accurate image recognition and speech recognition. The method involves ingesting lots of data to train systems called neural networks, and then feeding new data to those systems in an attempt to make predictions.
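The train-then-predict loop described above can be illustrated with the smallest possible "neural network": a single perceptron fit on a toy labeled dataset, then queried on an input. This is a sketch of the general method only, not of Google's models:

```python
import numpy as np

# Toy labeled data: learn logical AND as the "pattern" in the data.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y = np.array([0, 0, 0, 1])

w = np.zeros(2)
b = 0.0
for _ in range(10):                 # training: repeatedly ingest the data
    for xi, yi in zip(X, y):
        pred = int(w @ xi + b > 0)  # current guess for this example
        w += (yi - pred) * xi       # perceptron update rule
        b += (yi - pred)

# Prediction: feed new input to the trained system.
prediction = int(w @ np.array([1, 1]) + b > 0)
```

Real speech and image models stack millions of such units into deep networks, but the ingest-data-then-predict structure is the same.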
“We’ve been using voice as an input across many of our products,” Pichai said onstage. “That’s…
Facebook Artificial Intelligence Research (FAIR) today announced plans to launch a testing environment in which AI researchers and bot makers can share and iterate upon each other’s work.
While the initial focus is on open-sourcing the dialogue data necessary to train machines to carry on conversations, other research on ParlAI will focus on computer vision and fields of AI beyond the natural language understanding required for this task. The combination of smarts from multiple bots and bot-to-bot communication will also be part of research carried out on ParlAI.
Researchers or users of ParlAI must have Python knowledge to test and train AI models with the open source platform. The purpose of ParlAI, said director of Facebook AI Research Yann LeCun, is to “push the state of the art further.”
“Essentially, this is a problem that goes beyond any one heavily regarded dialogue agent that has sufficient background knowledge. A part of that goes really beyond strictly getting machines to understand language or being able to understand speech. It’s more how do machines really become intelligent, and this is not something that any single entity — whether it’s Facebook or any other — can solve by itself, and so that’s why we’re trying to sort of play a leadership role in the research community and trying to direct them all to the right problem.”
Nvidia and SAP have teamed up to use artificial intelligence and computer vision to figure out how many times a brand appears in the real world.
Somewhere, somebody whose job it is to count how many times a logo appears on a race car in front of a TV camera or a crowd in the real world is saying thanks. Normally, it takes humans a lot of work to estimate how many advertising impressions are made in the real world. Nvidia showed a demo of the capability at its GPU Technology Conference in San Jose, Calif.
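Once a vision model has detected logos frame by frame, the impression counting itself is simple bookkeeping. A sketch of the tallying step (the brand names and per-frame detections are invented stand-ins for real detector output):

```python
from collections import Counter

def count_impressions(frames):
    """Tally how many frames each brand's logo appears in, given
    per-frame lists of detected logos."""
    totals = Counter()
    for detections in frames:
        totals.update(set(detections))  # one impression per brand per frame
    return totals

# Hypothetical detector output for three frames of broadcast footage:
frames = [["acme", "acme", "globex"], ["acme"], ["globex"]]
impressions = count_impressions(frames)
```

The hard part, and the piece Nvidia and SAP's demo addressed, is producing those per-frame detections reliably from video in the first place.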