Figure 3.5: The Microsoft Kinect depth camera.
The Microsoft Kinect, shown in Figure 3.5, is a commercially available system that can be used to create depth maps and to control the altitude of a multi-rotor (Stowers et al., 2011). The system illuminates the scene with infra-red light using a proprietary technique. Although Microsoft does not state the technique directly, the Kinect V1 projects a pattern of infra-red beams from its emitter; when the infra-red depth sensor receives the reflected beams, it calculates the distance9. The Kinect uses a monochrome CMOS camera and a separate RGB camera to capture images. With a horizontal field of view of 57° and a vertical field of view of 43°, the Kinect can achieve accuracies of 1-4 cm and has a frame rate of 10 Hz (Obdrzalek et al., 2012; Stoyanov et al., 2013). Although the sensor is inexpensive in comparison to other depth cameras, it is easily affected by external light and reflective objects (Beltran & Basañez, 2014).
7http://wiki.ros.org/velodyne_pointcloud
8http://wiki.ros.org/pointcloud_to_laserscan
The Microsoft Kinect, like the SwissRanger and SPAD sensors, provided scene capturing depth data. To convert this scene capturing depth data into rotating laser data, the ROS node depthimage_to_laserscan10 was utilised.
This node examines the incoming depth data and takes a horizontal line across the depth image at a specified height. The line spans the full width of the image, so the resulting scan has the same horizontal field of view as the original depth image. The line is then published as rotating laser depth data, which can be utilised in SLAM algorithms.
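The row-extraction idea described above can be sketched in a few lines. This is an illustrative sketch only, not the implementation of depthimage_to_laserscan: the function name `depth_row_to_scan`, the list-of-lists image layout, and the ideal pinhole model whose focal length is derived from the horizontal field of view are all assumptions made for the example.

```python
import math


def depth_row_to_scan(depth_image, row, hfov_deg):
    """Turn one row of a depth image into (angle, range) pairs.

    Assumptions (not taken from the ROS node): depth values are in
    metres along the optical axis, the camera is an ideal pinhole,
    and the focal length follows from the horizontal field of view.
    """
    width = len(depth_image[0])
    cx = (width - 1) / 2.0  # assumed principal point: image centre
    fx = (width / 2.0) / math.tan(math.radians(hfov_deg) / 2.0)

    scan = []
    for u, z in enumerate(depth_image[row]):
        if z is None or not math.isfinite(z) or z <= 0.0:
            # Invalid pixel: keep the bearing, report no return.
            scan.append((math.atan2(u - cx, fx), float("inf")))
            continue
        x = (u - cx) * z / fx  # lateral offset of the point in metres
        scan.append((math.atan2(x, z), math.hypot(x, z)))
    return scan
```

For a Kinect depth image, the call might look like `depth_row_to_scan(image, row=len(image) // 2, hfov_deg=57.0)`, which takes the middle row and the horizontal field of view quoted earlier. Each depth image then contributes a single list of bearings and ranges rather than a full two-dimensional array.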
3.4.3.2 Intel RealSense R200
The Intel RealSense R200 is a commercially available scene capturing sensor that operates on a structured light principle. Due to commercial confidentiality, not much is known about the exact principle it uses. Its relatively small size, low power requirements, and relatively low cost have made it a very popular choice in robotics since its release11.
The Intel RealSense R200 provides distance data in the form of scene capturing depth images. These images have a very large data size and can take a while to load and convert to rotating laser depth data. The conversion used the same ROS node that was used for the Microsoft Kinect data. After conversion, the file size of the RealSense data was substantially reduced, because only a single horizontal line of each depth image is retained. With the smaller data size, the data was processed faster by the SLAM algorithm, which reduced computing time.
3.4.4 Confidence in data
Some of the sensors utilised in this study were able to provide an estimate of how confident they were in each distance measurement. A reading that the sensor judged to be unreliable was assigned a low confidence value. This information is useful for SLAM, because ranging measurements that are likely to be incorrect can be ignored. The sensors used here that provided confidence data were:
• SwissRanger 4000 (9 metre)
• SPAD Sensor
Each sensor had a different way to measure confidence.
3.4.4.1 SwissRanger 4000 (9 metre)
The SwissRanger camera outputs a confidence value between 0 and 65535 for each pixel. All points with a confidence value below 12816 were filtered out, as this was found to greatly improve the quality of the SLAM result.
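A minimal sketch of such a threshold filter is given below. The threshold of 12816 is the value reported above; the function name `filter_by_confidence` and the representation of points and confidences as parallel lists are assumptions for illustration, not the processing code used in the study.

```python
CONFIDENCE_THRESHOLD = 12816  # value reported in the text


def filter_by_confidence(points, confidences, threshold=CONFIDENCE_THRESHOLD):
    """Keep only the points whose confidence is at or above the threshold.

    `points` and `confidences` are assumed to be parallel sequences,
    one confidence value per point (hypothetical data layout).
    """
    return [p for p, c in zip(points, confidences) if c >= threshold]
```

A point with a confidence of exactly 12816 is retained here, since the text describes removing points below that value.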
10http://wiki.ros.org/depthimage_to_laserscan
3.4.4.2 SPAD Sensors
To calculate confidence data for the SPAD sensor, the amplitude data was divided by the background data for each pixel, effectively giving a signal-to-noise ratio. Low-confidence pixels were then removed using the same filtering method that was employed for the SwissRanger.
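The per-pixel ratio can be sketched as follows. The function names, the list-of-lists image layout, the small `eps` guard against division by zero, and the `threshold` argument are all assumptions made for this example; the threshold applied to the SPAD data is not stated here, so none is supplied as a default.

```python
def spad_confidence(amplitude, background, eps=1e-9):
    """Per-pixel amplitude / background ratio (a signal-to-noise proxy).

    `eps` is a hypothetical guard so that a zero background reading
    does not cause a division by zero.
    """
    return [
        [a / max(b, eps) for a, b in zip(a_row, b_row)]
        for a_row, b_row in zip(amplitude, background)
    ]


def mask_low_confidence(depth, confidence, threshold):
    """Replace depth values whose confidence is below `threshold` with None."""
    return [
        [d if c >= threshold else None for d, c in zip(d_row, c_row)]
        for d_row, c_row in zip(depth, confidence)
    ]
```

Masking with `None` rather than deleting pixels keeps the image shape intact, which is convenient if the depth image is later converted to rotating laser data.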
3.4.5 Time synchronisation
During data acquisition in the main experiment, the issue arose of how to synchronise time across numerous laptops. Numerous laptops had to be utilised because no single laptop had a sufficient number of USB ports. For some experimental runs, up to seven different sensors were recording data simultaneously, not including the Optitrack System that was also utilised to record odometry data. To timestamp and synchronise the data, a Network Time Protocol (NTP) server was utilised. NTP was installed on all the laptops and computers utilised in the experiment. Before any data was recorded for an experimental run, each computer re-synchronised with the NTP server to ensure that all the laptops would timestamp the data correctly. Timestamping the data is vital so that the odometry data and the ranging sensor data can be matched and then input into the SLAM algorithms.
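Once all machines share a common clock, matching the two data streams reduces to pairing timestamps. The sketch below shows one common way to do this, nearest-timestamp association with a tolerance. It is an assumption for illustration rather than the procedure used in this study; the function name `match_by_timestamp` and the 50 ms default tolerance `max_dt` are hypothetical.

```python
from bisect import bisect_left


def match_by_timestamp(scan_times, odom_times, max_dt=0.05):
    """For each scan timestamp, return the index of the nearest odometry
    timestamp, or None if the nearest one is more than `max_dt` seconds away.

    `odom_times` must be sorted in ascending order.
    """
    matches = []
    for t in scan_times:
        i = bisect_left(odom_times, t)
        # The nearest sample is either just before or just after position i.
        candidates = [j for j in (i - 1, i) if 0 <= j < len(odom_times)]
        if not candidates:
            matches.append(None)
            continue
        best = min(candidates, key=lambda j: abs(odom_times[j] - t))
        matches.append(best if abs(odom_times[best] - t) <= max_dt else None)
    return matches
```

Scans with no odometry sample inside the tolerance are returned as `None`, so they can be skipped rather than paired with a stale pose.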