List of Accepted Papers
Abstract: Autonomous vehicles have to know their pose, speed, and heading, because this information is needed for many fundamental tasks. Although GPS is very useful for this purpose, it has a few drawbacks in urban applications. Visual odometry is an alternative or complementary method, because it uses a sensor already available in many vehicles for other tasks and because it provides the ego-motion of the vehicle with sufficient accuracy. In this paper, a new method is proposed which detects and tracks features available on the ground surface, arising from the texture of the road or street and from road markings. This ensures that only static points are taken into account when computing the relative movement between images. A Kalman filter that takes the Ackermann steering constraints into account is used. Results in a real urban environment are shown to demonstrate the performance of the algorithm.
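As an illustration of the kind of Ackermann-constrained filtering the abstract describes, here is a minimal sketch of a bicycle-model prediction and a standard Kalman correction with a visual-odometry measurement. The wheelbase, time step, and interfaces are illustrative assumptions, not the paper's values:

    import numpy as np

    # State: [x, y, heading, speed]; Ackermann (bicycle) model, wheelbase L.
    L = 2.7   # assumed wheelbase in meters
    dt = 0.1  # assumed frame period in seconds

    def predict(state, steer_angle):
        """Propagate the state with the Ackermann kinematic model."""
        x, y, th, v = state
        return np.array([x + v * np.cos(th) * dt,
                         y + v * np.sin(th) * dt,
                         th + (v / L) * np.tan(steer_angle) * dt,
                         v])

    def update(state, P, z, H, R):
        """Standard Kalman correction with a visual-odometry measurement z."""
        y_res = z - H @ state                # innovation
        S = H @ P @ H.T + R                  # innovation covariance
        K = P @ H.T @ np.linalg.inv(S)       # Kalman gain
        return state + K @ y_res, (np.eye(len(state)) - K @ H) @ P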
Abstract: Convolutional Neural Networks (CNNs) beat human performance in the German Traffic Sign Benchmark competition. Both the winner and the runner-up teams trained CNNs to recognize $43$ traffic signs. However, neither network is computationally efficient, since both have many free parameters and use computationally expensive activation functions. In this paper, we propose a new architecture that reduces the number of parameters by 27% and 22% compared with these two networks. Furthermore, our network uses the Leaky Rectified Linear Unit (Leaky ReLU) activation function. Compared with the 10 multiplications needed by the hyperbolic tangent and rectified sigmoid activation functions utilized in the two networks, Leaky ReLU needs only one multiplication, which makes it computationally much more efficient than the other two functions. Our experiments on the German Traffic Sign Benchmark dataset show a 0.6% improvement on the best reported classification accuracy, while reducing the overall number of parameters and the number of multiplications by 85% and 88%, respectively, compared with the winning network in the competition. Finally, we inspect the behaviour of the network by visualizing the classification score as a function of partial occlusion. The visualization shows that our CNN learns the pictographs of the signs and ignores shape and color information.
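The single multiplication the abstract refers to is visible in a direct implementation of Leaky ReLU; a minimal sketch (the slope 0.01 is a common default, not necessarily the paper's value):

    import numpy as np

    def leaky_relu(x, slope=0.01):
        """One multiply per negative element; identity on positives."""
        return np.where(x > 0, x, slope * x)

    # By contrast, tanh(x) is typically evaluated with a rational
    # approximation costing on the order of ten multiplications per input.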
Abstract: Particle filters are sequential Monte Carlo estimation methods with applications in mobile robotics for tasks such as tracking, simultaneous localization and mapping (SLAM), and navigation, dealing with the uncertainties and/or noise generated by the sensors as well as with the intrinsic uncertainties of the environment. This work presents a field-programmable gate array (FPGA) implementation of a particle filter applied to the SLAM problem, based on a low-cost Neato XV-11 laser scanner sensor. Post-processing is performed on data provided by a realistic simulation of a differential robot, equipped with a hacked Neato XV-11 laser scanner, that navigates in the Robot@Factory competition maze. The robot was simulated using SimTwo, a realistic simulation package that can support several types of robots. The simulator provides the robot's ground truth, odometry, and laser scanner data. The results of this study confirm the feasibility of using such a low-cost laser scanner for different robotic applications and the effectiveness of applying FPGAs to SLAM.
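For reference, one predict-weight-resample cycle of a generic particle filter of the kind the abstract maps to hardware; this is an illustrative software sketch (the motion and measurement models are placeholders), not the FPGA design:

    import numpy as np

    def particle_filter_step(particles, weights, control, scan,
                             motion_model, likelihood):
        """One predict-weight-resample cycle over N pose particles."""
        # Predict: propagate each particle through the (noisy) motion model.
        particles = np.array([motion_model(p, control) for p in particles])
        # Weight: score each particle against the current laser scan.
        weights = weights * np.array([likelihood(p, scan) for p in particles])
        weights /= weights.sum()
        # Resample: draw particles proportionally to their weights.
        idx = np.random.choice(len(particles), size=len(particles), p=weights)
        return particles[idx], np.full(len(particles), 1.0 / len(particles))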
Abstract: Laser scanners are widely used in mobile robot localization systems but, despite their enormous potential, their high price is a major drawback, mainly for hobbyist and educational robotics practitioners, who usually have a reduced budget. This paper presents the modeling and simulation of a hacked Neato XV-11 laser scanner, motivated by the fact that it is a very low-cost alternative to the currently available laser scanners. Modeling the hacked Neato XV-11 laser scanner allows its realistic simulation and provides valuable information that can promote the development of better robot localization systems based on this sensor. The sensor simulation was developed using SimTwo, a realistic simulation package that can support several types of robots.
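A minimal sketch of the kind of range-sensor model such a simulation needs. The 360-sample, one-degree layout matches the XV-11's advertised scan pattern; the noise, range limit, and dropout figures are illustrative assumptions:

    import numpy as np

    def simulate_scan(true_ranges, sigma=0.02, max_range=5.0, dropout=0.02):
        """Corrupt ideal ranges with Gaussian noise, saturation, dropouts."""
        true_ranges = np.asarray(true_ranges, dtype=float)
        assert len(true_ranges) == 360           # one reading per degree
        noisy = true_ranges + np.random.normal(0.0, sigma, 360)
        noisy = np.clip(noisy, 0.0, max_range)   # sensor's maximum range
        lost = np.random.random(360) < dropout   # occasional missed returns
        noisy[lost] = 0.0                        # report dropouts as zeros
        return noisy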
Abstract: This paper proposes a novel method for appearance-based vehicle detection employing a stereo vision system and radar units. In advanced driver assistance systems, the detection and tracking of moving objects, particularly vehicles, represents an essential task. For such applications, it has often been suggested to combine multiple sensors with complementary modalities. Accordingly, in this work we utilize a stereo vision system and two radar units, and fuse the corresponding modalities at the detection level. Firstly, the algorithm executes the detection procedure based on the stereo images alone, generating information about vehicle positions. Secondly, the final unique list of vehicles is obtained by overlapping the radar readings with the preliminary list obtained by the stereo system. The stereo vision-based detection procedure consists of (i) edge processing that also incorporates information from the disparity map, (ii) shape-based extraction of vehicle contours, and (iii) generation of preliminary vehicle positions. Since the radar readings are examined by overlapping them with the list obtained by stereo vision, the proposed algorithm can be considered a high-level fusion approach. We analyze the performance of the proposed algorithm in a real-world experiment in a highly dynamic urban environment, under significant illumination effects caused by sunny weather.
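The overlap step described above amounts to gating radar returns against the stereo detections; a minimal nearest-neighbor sketch (the gate radius and the (x, y) ground-plane representation are assumed simplifications):

    import numpy as np

    def fuse_detections(stereo_positions, radar_points, gate=2.0):
        """Keep a stereo detection if a radar return lies within `gate`
        meters of its ground-plane position (x, y) in the vehicle frame."""
        radar_points = np.asarray(radar_points)
        fused = []
        for pos in stereo_positions:
            d = np.linalg.norm(radar_points - np.asarray(pos), axis=1)
            if d.size and d.min() < gate:
                fused.append(pos)                # confirmed by radar
        return fused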
Abstract: Robust image feature methods are a tool enabling monocular visual navigation of mobile robots. The methods referred to herein, similar to the original SURFnav [10], rely on the detection of stable interest points (features) in the camera image. Processing the interest points together with the robot odometry adjusts the robot's bearing along its trajectory. This allows re-execution of previously gathered paths using a stored robust image feature model of the path. In this paper, we present an extension of this method. By using additional visual information, the need for reliable dead-reckoning is obviated. This step makes purely vision-based navigation possible. To keep the procedure computationally efficient, the method takes advantage of reusing the same image features already applied in the core navigation procedure.
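A minimal sketch of the bearing-correction idea underlying such feature-based path re-execution: the mean horizontal displacement of matched features is turned into a heading correction. The field of view, image width, and linear mapping are illustrative assumptions:

    import numpy as np

    def bearing_correction(mapped_u, current_u, fov_deg=60.0, width_px=640):
        """Turn the mean horizontal offset (pixels) of features matched
        between the stored map and the current image into a heading
        correction (degrees)."""
        offsets = np.asarray(current_u) - np.asarray(mapped_u)
        mean_shift = np.mean(offsets)        # positive: scene shifted right
        deg_per_px = fov_deg / width_px
        return -mean_shift * deg_per_px      # steer against the shift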
Abstract: Perception is the process by which an intelligent vehicle translates sensory data into an understanding of the world around it. Perception of dynamic environments is one of the key components for intelligent vehicles operating in real-world environments. This paper proposes a method for static/dynamic modeling of the environment. The proposed system comprises two main modules: (i) a module which estimates the ground surface using a piecewise surface fitting algorithm, and (ii) a module which builds a voxel-based static/dynamic model of the vehicle's surrounding environment using discriminative analysis. The proposed method is evaluated using the KITTI dataset. Experimental results demonstrate the applicability of the proposed method.
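To fix ideas, a deliberately simple occupancy-flip heuristic for labeling voxels static or dynamic over a window of frames; this is a toy stand-in, far simpler than the discriminative analysis the paper proposes:

    import numpy as np

    def classify_voxels(occupancy_history):
        """occupancy_history: boolean array (frames, voxels), occupancy per
        frame. A voxel observed both occupied and free over the window is
        labeled dynamic; consistently occupied voxels are labeled static."""
        seen_occupied = occupancy_history.any(axis=0)
        always_occupied = occupancy_history.all(axis=0)
        dynamic = seen_occupied & ~always_occupied
        return always_occupied, dynamic      # (static, dynamic) masks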
Abstract: The number of LIDAR sensors installed in robotic vehicles has been increasing, a situation that reinforces the concern with sensor calibration. Most calibration systems rely on manual or semi-automatic interactive procedures, but fully automatic methods are still missing because nearby objects vary with the point of view. However, if some simple objects could be detected and identified automatically by all the sensors from several points of view, then automatic calibration would be possible on the fly. This is indeed feasible if a ball is placed in motion in front of the set of uncalibrated sensors, allowing them to detect its center along the successive positions. This set of centers generates as many point clouds as there are sensors which, by using segmentation and fitting techniques, allows the calculation of the rigid-body transformation between all pairs of sensors. This paper proposes and describes such a method, with encouraging preliminary results.
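The pairwise rigid-body transformation between two sensors' ball-center clouds can be recovered with the standard SVD-based (Kabsch) alignment; a minimal sketch assuming the centers are already matched, e.g., by timestamp (the paper's own segmentation and fitting steps are not shown):

    import numpy as np

    def rigid_transform(A, B):
        """Least-squares R, t with B ≈ R @ A + t; A, B: (N, 3) matched centers."""
        cA, cB = A.mean(axis=0), B.mean(axis=0)
        H = (A - cA).T @ (B - cB)            # cross-covariance matrix
        U, _, Vt = np.linalg.svd(H)
        R = Vt.T @ U.T
        if np.linalg.det(R) < 0:             # guard against reflections
            Vt[-1] *= -1
            R = Vt.T @ U.T
        t = cB - R @ cA
        return R, t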
Abstract: This paper presents an algorithm for pedestrian pose estimation using a stereo vision system in the ADAS context. The proposed approach isolates the pedestrian point cloud and extracts the pedestrian pose using a visibility-based pedestrian 3D model. The model accurately predicts possible self-occlusions and uses them as an integrated part of the detection. The algorithm creates multiple pose hypotheses that are scored and sorted using a scheme reminiscent of Monte Carlo techniques. The technique performs a hierarchical search of the body pose from the head position down to the lower limbs. In the context of road safety, it is important that the algorithm perceives the pedestrian pose as quickly as possible to potentially avoid dangerous situations; the pedestrian pose allows better prediction of the pedestrian's intentions. To this end, a single pedestrian model is used to detect all pertinent poses, and the algorithm is able to extract the pedestrian pose based on a single stereo depth point cloud and minimal orientation information. The algorithm was tested against data captured with an industry-standard motion capture system. The algorithm correctly estimates the pedestrian pose with acceptable accuracy. The use of a stereo setup allows the algorithm to be used in many varied contexts, ranging from the proposed ADAS context to surveillance or even human-computer interaction.
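A minimal sketch of the hypothesize-score-sort loop the abstract describes; the pose sampler and the point-cloud scoring function are placeholders, and the hypothesis counts are arbitrary:

    def best_poses(sample_pose, score_pose, cloud, n_hypotheses=500, keep=10):
        """Generate pose hypotheses, score them against the stereo point
        cloud, and return the `keep` highest-scoring ones."""
        hypotheses = [sample_pose() for _ in range(n_hypotheses)]
        scored = [(score_pose(h, cloud), h) for h in hypotheses]
        scored.sort(key=lambda sh: sh[0], reverse=True)
        return [h for _, h in scored[:keep]]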
Abstract: In this paper, we present a novel methodology for computing a 3D scene representation. The algorithm uses macro-scale polygonal primitives to model the scene: the representation of the scene is given as a list of large-scale polygons that describe the geometric structure of the environment. Results show that the approach is capable of producing accurate descriptions of the scene. In addition, the algorithm is very efficient when compared to other techniques.
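Extracting a large planar polygon from points typically starts from a least-squares plane fit; a minimal sketch of that first step (robust outlier handling and polygon boundary extraction, which the full method would need, are omitted):

    import numpy as np

    def fit_plane(points):
        """Least-squares plane through (N, 3) points: returns a unit normal
        n and offset d such that n . p + d ≈ 0 for points p on the plane."""
        centroid = points.mean(axis=0)
        _, _, Vt = np.linalg.svd(points - centroid)
        normal = Vt[-1]                      # direction of least variance
        return normal, -normal @ centroid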
Abstract: This manuscript evaluates the performance of a monocular visual odometry approach when images from different spectra are considered, both independently and fused. The objective behind this evaluation is to analyze whether classical approaches can be improved when the given images, which come from different spectra, are fused and represented in new domains. The images in these new domains should have some of the following properties: $i)$ greater robustness to noisy data; $ii)$ less sensitivity to changes (e.g., lighting); $iii)$ richer descriptive information, among others. In the current work, two different image fusion strategies are considered. Firstly, images from the visible and thermal spectra are fused using a Discrete Wavelet Transform (DWT) approach. Secondly, a monochrome threshold strategy is considered. The obtained representations are evaluated within a visual odometry framework, highlighting their advantages and disadvantages, using different urban and semi-urban scenarios. Comparisons with both the monocular visible spectrum and the monocular infrared spectrum are also provided, showing the validity of the proposed approach.
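A minimal sketch of one common DWT fusion rule (average the approximation bands, keep the maximum-magnitude detail coefficients). It uses PyWavelets with a Haar wavelet and a single decomposition level as assumptions, not necessarily the paper's choices:

    import numpy as np
    import pywt

    def dwt_fuse(visible, thermal, wavelet="haar"):
        """Single-level DWT fusion of two registered, same-size gray images."""
        cA1, (cH1, cV1, cD1) = pywt.dwt2(visible, wavelet)
        cA2, (cH2, cV2, cD2) = pywt.dwt2(thermal, wavelet)
        cA = 0.5 * (cA1 + cA2)               # blend the approximation bands
        pick = lambda a, b: np.where(np.abs(a) >= np.abs(b), a, b)
        fused = (cA, (pick(cH1, cH2), pick(cV1, cV2), pick(cD1, cD2)))
        return pywt.idwt2(fused, wavelet)    # back to the image domain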
Program
Final program still to be defined. Preliminary proposal:
- Session 1, Auditorium - 19 Nov 2015, 10:30 - 12:30, 6 papers
- Session 2, Auditorium - 19 Nov 2015, 14:00 - 16:00, 5 papers