DEVELOPMENT OF A PORTABLE HIGH PERFORMANCE MOBILE MAPPING 
SYSTEM USING THE ROBOT OPERATING SYSTEM 

 
S. Blaser 1, S. Cavegn 1, 2, S. Nebiker 1 
 

1 Institute of Geomatics, FHNW University of Applied Sciences and Arts Northwestern Switzerland, Muttenz, Switzerland 
- (stefan.blaser, stefan.cavegn, stephan.nebiker)@fhnw.ch 

2 Institute for Photogrammetry, University of Stuttgart, Germany 
 

Commission I, WG I/7 
 
 
KEY WORDS: Mobile Mapping, Indoor, Robot Operating System, ROS, SLAM, Sensor Orientation, Infrastructure 
 
 
ABSTRACT: 
 
The rapid progression in digitalization in the construction industry and in facility management creates an enormous demand for the 
efficient and accurate reality capturing of indoor spaces. Cloud-based services based on georeferenced metric 3D imagery are already 
extensively used for infrastructure management in outdoor environments. The goal of our research is to enable such services for indoor 
applications as well. For this purpose, we designed a portable mobile mapping research platform with a strong focus on acquiring 
accurate 3D imagery. Our system consists of a multi-head panorama camera in combination with two multi-profile LiDAR scanners 
and a MEMS-based industrial grade IMU for LiDAR-based online and offline SLAM. Our modular implementation based on the 
Robot Operating System enables rapid adaptations of the sensor configuration and the acquisition software. The developed workflow 
provides for completely GNSS-independent data acquisition and camera pose estimation using LiDAR-based SLAM. Furthermore, we 
apply a novel image-based georeferencing approach for further improving camera poses. First performance evaluations show an 
improvement from LiDAR-based SLAM to image-based georeferencing by an order of magnitude: from 10-13 cm to 1.3-1.8 cm in 
absolute 3D point accuracy and from 8-12 cm to sub-centimeter in relative 3D point accuracy. 
 
 
1. INTRODUCTION 

Digitalization trends on a broad scale lead to a rapid and massive 
transformation of the construction and real estate industries. New 
methods and tools such as BIM (Building Information Modeling) 
and VDC (Virtual Design and Construction) in combination with 
new technologies for geospatial data capture and exploitation 
enable a paradigm shift in the way buildings are designed, tested, 
built, maintained and refurbished. The establishment of fully 
three-dimensional collaborative processes and workflows with 
stakeholders from multiple domains requires accurate, detailed 
and up-to-date 3D geodata. New image-based mobile reality 
capturing techniques in combination with cloud technologies, 
such as presented by Nebiker et al. (2015), hold the potential to 
provide such data and services in a rapid, cost-efficient and 
user-friendly manner. As shown by Puente et al. (2013), current 
commercial mobile mapping systems (MMS) are dominated by 
LiDAR as the primary sensor, with cameras as complementary 
sensors. However, the first road- and railway-based mobile mapping 
experiments were based on stereo camera systems and date back 
to the early 1990s (Novak, 1991; Schwarz et al., 1993). Since 
then image-based outdoor mobile mapping systems have evolved 
into multi-stereo systems (Cavegn & Haala, 2016) and into high-
performance 360° stereo systems (Blaser et al., 2017; Meilland 
et al., 2015) with an unparalleled information richness and 
density. 

In terms of positioning technologies, the vast majority of outdoor 
mobile mapping solutions currently rely on direct georeferencing 
using GNSS and INS. However, alternative approaches are 
required for positioning and pose estimation in indoor 
environments and in environments with poor GNSS coverage 
such as forests or urban canyons. Recent developments in indoor 
localization and mapping benefit from and build on methods and 
techniques, namely SLAM (simultaneous localization and 
mapping), from the robotics and more recently from the computer 
vision communities (Stachniss et al., 2016; Thrun, 2002). Indoor 
mobile mapping not only requires new georeferencing strategies, 
but also new platforms. Solutions using carts, like the Viametris 
iMMS (Thomson et al., 2013), work well in large unobstructed 
spaces but are not suitable for typical indoor environments with 
stairs, closed doors, obstructed floors due to ongoing 
construction etc. Thus, the focus in research and commercial 
development has shifted towards portable or ‘personal’ mobile 
mapping systems (Lehtola et al., 2017; Nüchter et al., 2015). But 
with very few exceptions, such as the image-based UltraCam 
Panther (Vexcel, 2018), most developments focus on LiDAR-
based systems and point clouds. 

In this paper, we introduce the BIMAGE backpack, a portable 
high-performance mobile mapping research platform and system 
built on top of the Robot Operating System (ROS). The system 
is designed for the creation of 3D image data spaces and for the 
image-based building and infrastructure management (Nebiker et 
al., 2015). It features LiDAR SLAM-based real-time 3D mapping 
with a subsequent novel image-based georeferencing approach 
using relative orientation constraints leading to significant 
improvements in relative and absolute accuracies.  

First, we discuss the modular mechanical, electronic and software 
design of the BIMAGE backpack, allowing the integration of 
multiple medium- to low-cost LiDAR sensors and panoramic 
camera sensors. In particular, we introduce a low-cost 
architecture for the mission-critical accurate triggering and 
synchronization of all types of sensors involved. We also provide 
details on the calibration of the multi-sensor system. The 
acquisition system provides a number of novel features, such as 
real-time 3D positioning and mapping capabilities and geometry-
driven camera triggering for optimal data coverage. Our highly 

ISPRS Annals of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Volume IV-1, 2018 
ISPRS TC I Mid-term Symposium “Innovative Sensing – From Sensors to Methods and Applications”, 10–12 October 2018, Karlsruhe, Germany

This contribution has been peer-reviewed. The double-blind peer-review was conducted on the basis of the full paper. 
https://doi.org/10.5194/isprs-annals-IV-1-13-2018 | © Authors 2018. CC BY 4.0 License.

 


automated processing workflow offers both LiDAR SLAM- and 
image-based georeferencing. In a performance evaluation, we 
demonstrate that our novel image-based georeferencing approach 
improves relative and absolute accuracies by an order of 
magnitude under real-world conditions. 
 
 
2. SYSTEM REQUIREMENTS 

In order to create image-based services for indoor infrastructure 
management with accurate 3D measurement functionality, we 
need an image-based indoor mobile mapping platform for 
efficient data acquisition.  
As the sensor technology improves rapidly, a prototypic indoor 
mobile mapping system for research purposes should provide the 
flexibility to easily integrate new components. Furthermore, it 
should also be possible to implement and investigate different 
sensor configurations. In order to achieve this flexibility, a 
modular design is indispensable. This affects both hardware and 
software. A trolley system like NavVis (Szwarc, 2017) would 
offer more flexibility and fewer limitations in terms of weight 
and size than a portable system. However, the applications of a 
trolley system are limited. For example, a trolley system cannot 
be used for capturing staircases or construction sites. For our 
research, it is important that the system is able to cover as many 
use cases as possible.  
As in our earlier mobile mapping solution (Burkhard et al., 2012), 
it should be possible to derive 3D imagery through dense image 
matching, ideally assigning a depth value to each image pixel. 
These 3D images are subsequently aggregated into 3D image 
data spaces, which can easily be navigated and used for simple 
3D measurements in single images (Nebiker et al., 2015). Ideally, 
a camera configuration should cover the indoor environment with 
multiple views and as completely as possible. For measuring 
purposes, both precise camera calibration and precise image 
orientation are essential. For indoor use, the camera has to 
perform well in low-light conditions and should be able to adapt 
rapidly to changing lighting conditions. Since the platform is 
mobile, the cameras must be triggered synchronously, ideally 
with an electronic trigger signal. 
Precise georeferencing is essential in mobile mapping. Inertial 
navigation systems (INS) combining GNSS and IMU are widely 
used for direct georeferencing in outdoor mobile mapping 
applications. The absence of GNSS in indoor environments 
precludes this sensor combination from being applied to indoor 
mobile mapping. A possible way to overcome the lack of GNSS 
is to replace INS with LiDAR-based simultaneous localization 
and mapping (SLAM) (Hess et al., 2016). 
One approach to increasing the accuracy of direct georeferencing 
is to apply an additional image-based georeferencing step. With 
this approach, Cavegn et al. (2016) achieved an accuracy 
improvement by an order of magnitude in urban street 
environments. However, the success of this approach depends on 
the number and distribution of detected and matched features in 
the outdoor environment. We assume that image-based 
georeferencing will provide a significant accuracy improvement 
in structurally rich indoor environments. 
In contrast to most commercial indoor mobile mapping systems 
(Leica Geosystems, 2018; Vexcel, 2018), our system should 
support the complete system initialization in an indoor 
environment. For reasons of efficiency, this is essential for the 
mapping of large building complexes. In addition, a status 
indicator should provide real-time information about 
completeness and accuracy of data acquisition. 
 
 
3. SYSTEM COMPONENTS 

Our portable mobile mapping system consists of different 
components for navigation, mapping, data registration and data 
pre-processing. A panorama camera captures images in order to 
map the entire indoor environment. Multi-profile laser scanner 
data and IMU data fused in a LiDAR SLAM algorithm deliver 
the system position and attitude. A computer with high RAM and 
storage capacities and low energy consumption processes the 
SLAM algorithm and registers the captured data. 
 
3.1 Panorama Camera 

For image-based indoor mapping, the system uses a FLIR 
Ladybug5 panorama camera, which consists of six individual 
camera heads: five point sideways and one points upwards. 
Each camera head has a resolution of 5 MP and features 
ultra-wide-angle optics (see Table 1). Previous investigations 
showed that the equidistant camera model (fisheye) fits the 
Ladybug5 camera heads best (Blaser et al., 2017). A general-purpose 
input and output interface (GPIO) supports several 
triggering modes with an electronic signal (FLIR Inc., 2017). 

 
Sensor type            Sony ICX655 CCD 
Shutter type           Global shutter 
Resolution [px]        2448 x 2048 
Pixel size [µm]        3.45 
Focal length [mm]      4.3 
Field-of-view [deg]    113 x 94 

Table 1. Specifications of the panorama camera. Technical data 
of an individual Ladybug5 camera head (FLIR Inc., 2017). 

 
3.2 Laser Scanner 

The system uses two multi-profile Velodyne VLP-16 laser 
scanners for navigation as well as for mapping. Velodyne 
designed the VLP-16 as a rugged medium- to low-cost laser 
scanner primarily for collision avoidance applications in the 
automotive industry. The multi-profile laser scanner consists of 
16 individual laser modules mounted in a compact housing. The 
housing spins at an adjustable rate between 5 and 20 
rotations per second (Velodyne, 2016), so that the laser 
scanner covers the full horizontal field-of-view (see Table 2). The 
horizontal resolution is limited by the measurement rate of the 
laser modules and the rotation rate. In contrast, the 16 fixed laser 
modules limit the vertical field-of-view as well as the vertical 
resolution. 
Glennie et al. (2016) provide a calibration and a stability analysis 
of the Velodyne VLP-16 laser scanner. Their evaluations reveal 
a long-term walk in individual lasers. Thus, a geometric 
calibration does not remain temporally stable. The VLP-16 
supports time synchronization with a pulse-per-second (PPS) 
signal in conjunction with a one-per-second NMEA sentence 
(Velodyne, 2016). 
 

Max. range [m]                      100 
Typical accuracy [cm]               3.0 
Number of channels                  16 
Angular resolution (H x V) [deg]    0.1-0.4 x 2.0 
Field-of-view (H x V) [deg]         360 x 30 
Rotation rates [Hz]                 5-20 
Max. points per second              300’000 

Table 2. Specifications of the Velodyne VLP-16 Puck LiDAR 
scanner (Velodyne, 2016) 
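The angular resolutions in Table 2 follow from the point rate, the number of channels and the rotation rate. The sketch below illustrates this relationship under the assumption that the maximum point rate is distributed evenly over the 16 channels and that the channels are evenly spaced over the vertical field-of-view; the constant and function names are illustrative only.

```python
MAX_POINTS_PER_SECOND = 300_000   # Table 2
NUM_CHANNELS = 16                 # Table 2
VERTICAL_FOV_DEG = 30.0           # Table 2

def horizontal_resolution_deg(rotation_rate_hz):
    """Azimuth step between consecutive firings of one laser channel."""
    firings_per_channel_per_second = MAX_POINTS_PER_SECOND / NUM_CHANNELS
    firings_per_revolution = firings_per_channel_per_second / rotation_rate_hz
    return 360.0 / firings_per_revolution

def vertical_resolution_deg():
    """Angular spacing between neighbouring channels (assumed uniform)."""
    return VERTICAL_FOV_DEG / (NUM_CHANNELS - 1)

for rate_hz in (5, 10, 20):
    print(rate_hz, "Hz ->", round(horizontal_resolution_deg(rate_hz), 3), "deg")
print("vertical ->", vertical_resolution_deg(), "deg")
```

Under these assumptions, the horizontal resolution ranges from about 0.1 deg at 5 Hz to about 0.4 deg at 20 Hz, and the vertical resolution is 2.0 deg, consistent with Table 2.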

 
3.3 Inertial Measurement Unit 

The system contains an XSens MTI-300 IMU for navigation. The 
MTI-300 is a MEMS-based attitude heading reference system 
(AHRS), which returns drift-free roll and pitch as well as 
magnetically referenced yaw from an extended Kalman filter fusing 
raw data from gyroscopes, accelerometers and magnetometers 
(Kooi, 2014). The subsequent processing algorithms of our system 
exclusively use the raw accelerometer and gyroscope data, which 
are also accessible. Based on its specifications (see Table 3), the 
MTI-300 fits the classification of an industrial grade IMU. The 
IMU features a clock synchronization with a pulse per second 
(PPS) signal. 
 

                             Gyroscope          Accelerometer 
Bias repeatability (1 yr)    0.5 deg/s          0.05 m/s² 
In-run bias stability        10 deg/h           40 µg 
Noise density                0.015 deg/s/√Hz    150 µg/√Hz 

Table 3. Specifications of the MEMS-based XSens MTI-300 
AHRS IMU (Kooi, 2014) 
 
3.4 Computer 

The mobile mapping system is equipped with a Prime Mini Pro 
minicomputer with high RAM and data storage capacities and 
a mobile processor (see Table 4). The 
computer has low energy consumption and no mechanically 
moving parts. 

 
Processor        Intel Core i5-5300U vPro 
Graphics card    Intel HD Graphics 5500 
RAM              32 GB DDR3L, 1600 MHz 
Data storage     1x 500 GB SSD 
                 1x 2000 GB SSD 
Interfaces       4x USB 3.0 
                 1x GigE 
                 2x Mini-DP 1.2 

Table 4. Specifications of the on-board computer Prime Mini 
Pro (Prime Computer AG, 2017) 

 
3.5 Flashlights 

Self-designed, strip-shaped LED elements provide a flashlight 
for the artificial lighting of low-light indoor environments. We 
glued commercial low-cost LED strips onto an aluminum rail 
with a matt cover, which diffuses the light. A strip-shaped LED 
element with a length of 50 cm generates a luminous flux 
between 1200 and 1500 lm. 
 
 
4. SYSTEM CONFIGURATION AND CAMERA 
CALIBRATION 

When creating our prototypic mobile mapping system, we 
generally focused on a flexible design for the hardware and 
software components. We equipped our system with a 360° 
panorama camera, primarily for use as a mapping sensor. Our 
aim is to orient single panorama images in indoor environments 
as precisely as possible with the aid of two LiDAR sensors and 
an IMU as navigation sensors. We implemented a time 
synchronization based on electrical signals between all sensors to 
avoid systematic errors caused by software-induced time shifts. 
To further increase the accuracy, we performed a complete 
panorama camera calibration. 
 

4.1 Mechanical Design 

We assembled all sensors and components on a robust aluminum 
frame. The modular design allows rapid changes to the sensor 
configuration. 
We fixed the multi-head panorama camera on the frame so that 
the first camera head (cam0) faces backwards. The other four 
sideways-facing camera heads (cam1-cam4) are arranged in 
clockwise order. The sixth camera head (cam5) faces upwards. 
Moreover, we tilted the panorama camera by a few degrees to 
achieve a roughly horizontal camera plane when a person carries 
the backpack (see Figure 1, no. 2). The overlap between the images 
allows them to be stitched together into a panorama image, so that 
the surrounding frame is barely visible. 
One of the two LiDAR scanners sits on top of the frame and is 
tilted by 30° to cover mainly the walls as well as some parts of 
the floor and the ceiling (see Figure 1, no. 1). In contrast, the 
second vertically mounted LiDAR scanner (see Figure 1, no. 5) 
covers mainly the floor and the ceiling. The second LiDAR is 
complementary to the first one and gives additional geometric 
stability to the resulting point cloud. We fixed the IMU stably in 
the lower part of the frame (see Figure 1, no. 7). 
 

Figure 1. System overview demonstrating our backpack indoor 
mobile mapping system 
 

4.2 Electronic Design 

A 12 V lithium-ion battery with a capacity of 20 Ah (see Figure 1, 
no. 6) supplies power to the system. A single-board computer 
(computer II, an Arduino Nano) synchronizes the time between the 
sensors and triggers the camera and the flashlights with an 
electronic trigger signal. Computer II contains a GPS simulator, 
which produces a pulse-per-second (PPS) signal and one NMEA 
message per second to synchronize both LiDAR scanners (see 
Figure 2). A PPS signal also synchronizes the internal IMU clock. 
A post-processing step resolves the ambiguity of the internal IMU 
clock using the NMEA message, which computer II sends directly 
to computer I. 
For triggering the camera, computer I (Prime Mini Pro) sends a 
trigger command to computer II. Computer II then triggers the 
camera and the flashlights with an electronic signal and sends a 
message with the trigger timestamp back to computer I. All 

sensors send the data to computer I for registration and pre-
processing. 
 

Figure 2. Schema with sensor synchronization and data flow 
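To illustrate the once-per-second NMEA message of the GPS simulator, the following sketch builds a time-only sentence with a valid checksum. It is written in Python for readability, whereas the actual simulator runs on the Arduino Nano. The choice of the $GPRMC sentence type, the dummy position fields and the function names are assumptions made for this example; the text above does not specify which sentence is sent.

```python
from datetime import datetime, timezone
from functools import reduce

def nmea_checksum(payload: str) -> str:
    """XOR of all characters between '$' and '*', as two uppercase hex digits."""
    return f"{reduce(lambda acc, ch: acc ^ ord(ch), payload, 0):02X}"

def build_gprmc(t: datetime) -> str:
    """Build a minimal $GPRMC sentence that carries only time and date.

    Position, speed and course are dummy zeros (assumption): a simulator used
    purely for clock synchronization has no real position to report.
    """
    payload = f"GPRMC,{t:%H%M%S},A,0000.00,N,00000.00,E,0.0,0.0,{t:%d%m%y},,"
    return f"${payload}*{nmea_checksum(payload)}\r\n"

# Example timestamp, chosen arbitrarily for illustration
sentence = build_gprmc(datetime(2018, 3, 21, 10, 50, 0, tzinfo=timezone.utc))
print(sentence, end="")
```

In such a scheme, the PPS edge marks the exact start of a second, while the sentence identifies which second it was. This is the ambiguity that the post-processing step resolves for the IMU clock.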

 
4.3 Software Design 

Our acquisition software runs on computer I with a Linux Ubuntu 
16.04 installation. Any external device with Virtual Network 
Computing (VNC) support can control computer I via 
remote desktop over a WLAN connection. We implemented our 
entire acquisition software with the Robot Operating System 
(ROS) framework. Quigley et al. (2009) give an introduction to 
the principles, paradigms and functionality of ROS. Thanks to the 
open source philosophy and the wide distribution in robotics, 
there are numerous existing tools and a comprehensive hardware 
support. Particularly noteworthy is the flexible graph-based 
communication concept consisting of nodes, messages, topics 
and services. A node describes a software module, which we 
represent as a circle in our software schema (see Figure 3). Nodes 
can communicate with each other by passing messages in a 
strictly predefined data structure through given topics. In our 
software schema, we represent topics as rectangles (see Figure 
3). A node can publish messages to one or more topics (red 
arrows) or subscribe to one or more topics to receive 
messages (blue arrows). By contrast, services offer synchronous 
communication. A node can advertise a service with predefined 
data structures for both request and response, similar to a web 
service. Like topics, we represent services as rectangles in our 
software schema (see Figure 3), but depict the communication 
with services with green arrows. 
For both LiDAR scanners, we use the Velodyne driver with 
minor modifications by Withley (2016). The driver converts the 
sensor raw data into ROS PointCloud2 messages and 
publishes them. For the IMU, we use the XSens driver by Colas 
(2016) which converts the raw data into ROS Imu messages. For 
post-processing, the ROS node “Bag” stores the IMU messages 
as well as the messages from both LiDAR scanners in a so-called 
bag file. Furthermore, we fuse the LiDAR and IMU data with the 
3D LiDAR SLAM Cartographer. Hess et al. (2016) introduce the 
functionality of Google’s Cartographer using their 2D LiDAR 
SLAM as an example. Our new Conditional Trigger Node supports 
spatial trigger criteria, such as distance and orientation 
differences, as well as non-spatial criteria, such as time 
difference, and issues a trigger command when the criteria are 
met. The Conditional Trigger Node uses the system pose 
provided by the Cartographer to calculate spatial differences. 
The node of Computer II advertises a trigger command service, 
publishes the trigger time and logs all incoming messages from 
Computer II. As described in section 4.2, Computer II triggers 
the camera and the flashlights with a hardware signal. 
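The decision logic of such a conditional trigger can be sketched as follows. This is a simplified, ROS-independent illustration and not the implementation of the Conditional Trigger Node: the class and field names are hypothetical, the heading is reduced to a single yaw angle, and the criteria are combined with a logical OR, which is an assumption.

```python
import math
from dataclasses import dataclass

@dataclass
class PoseSample:
    t: float     # timestamp [s]
    x: float     # position [m]
    y: float
    z: float
    yaw: float   # heading [rad]

class ConditionalTrigger:
    """Fires when any enabled criterion is exceeded since the last trigger."""

    def __init__(self, max_distance_m=None, max_yaw_change_deg=None,
                 max_interval_s=None):
        self.max_distance_m = max_distance_m
        self.max_yaw_change_rad = (
            None if max_yaw_change_deg is None else math.radians(max_yaw_change_deg)
        )
        self.max_interval_s = max_interval_s
        self._last = None  # pose at the most recent trigger

    def update(self, pose: PoseSample) -> bool:
        """Return True if the camera should be triggered at this pose."""
        if self._last is None:
            self._last = pose
            return True
        last = self._last
        distance = math.dist((pose.x, pose.y, pose.z), (last.x, last.y, last.z))
        d_yaw = pose.yaw - last.yaw
        # Wrap the heading difference to [-pi, pi] before taking its magnitude
        yaw_change = abs(math.atan2(math.sin(d_yaw), math.cos(d_yaw)))
        elapsed = pose.t - last.t
        fire = (
            (self.max_distance_m is not None and distance >= self.max_distance_m)
            or (self.max_yaw_change_rad is not None
                and yaw_change >= self.max_yaw_change_rad)
            or (self.max_interval_s is not None and elapsed >= self.max_interval_s)
        )
        if fire:
            self._last = pose
        return fire

trigger = ConditionalTrigger(max_distance_m=1.0, max_yaw_change_deg=15.0)
print(trigger.update(PoseSample(0.0, 0.0, 0.0, 0.0, 0.0)))
```

Measuring the differences relative to the pose of the last trigger, rather than to the previous pose sample, keeps the image spacing independent of the SLAM pose rate.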
Furthermore, we use the driver by Rockey & Purvis (2015) for 
the panorama camera, which returns the six images from the 
camera heads concatenated into one image. The Image Slicer 
node separates the images again, and separate Image Writer node 
instances write the individual images. 

 
Figure 3. Schema with the implemented ROS software 
structure. ROS nodes are represented as circles and ROS topics 
as rectangles. Red arrows depict ROS publishers, blue arrows 
ROS subscribers and green arrows ROS services. 

 
4.4 Camera Calibration 

The calibration of the multi-head panorama camera consists of 
the interior orientation parameters (IOPs) of each camera head as 
well as the relative orientation parameters (ROPs) of each camera 
head to the first camera head (cam0). We used the constrained 
multi-system calibration software implemented by Burkhard et 
al. (2012) that is based on the bundle adjustment approach by 
Kersting et al. (2012). Blaser et al. (2018) extended this software 
with fisheye camera support using the equidistant camera 
projection model (Abraham & Förstner, 2005). 
We calibrated the panorama camera in our indoor calibration 
field, which features 188 well-distributed target points on all four 
walls as well as on the ceiling (see Figure 4). We measured the 
target points with a contactless high-precision industrial 
measurement system, so that the standard deviation of a target 
point is below 0.3 mm. 
Schmeing et al. (2011) modelled Ladybug3 panorama camera 
images as virtual spherical images. Rau et al. (2016) applied the 
same model to the Ladybug5 panorama camera as described in 
the Ladybug5 technical reference (FLIR Inc., 2017). Since we 
model each single camera head separately, no stitching error will 
occur. Our camera calibration investigations showed that the 
equidistant projection model (Abraham & Förstner, 2005) fits 
best for the Ladybug5 camera heads; therefore we apply the 
equidistant projection model to all individual camera heads. We 

estimated the following IOPs: focal length, principal point, two 
radial and two tangential distortion parameters. We performed 
the calibration with five image epochs at different locations and 
with 1010 image observations. The mean reprojection error 
amounts to 0.23 pixels. Table 5 shows the standard deviations of 
the ROPs between the first camera head and each other camera 
head. 
 

Figure 4. Indoor calibration field for multi-camera 
configurations of mobile mapping systems 
 
Camera head    X [mm]    Y [mm]    Z [mm]    ω [deg]    φ [deg]    κ [deg] 
cam1           0.27      0.18      0.34      0.033      0.023      0.030 
cam2           0.20      0.17      0.38      0.016      0.022      0.007 
cam3           0.24      0.17      0.40      0.017      0.023      0.009 
cam4           0.29      0.18      0.34      0.033      0.022      0.029 
cam5           0.34      0.62      0.46      0.016      0.010      0.016 
Mean           0.27      0.27      0.39      0.023      0.020      0.018 

Table 5. Standard deviations of the calibrated ROPs between the 
first camera head (cam0) and each other camera head 
(cam1-cam5) 
 
 
5. DATA ACQUISITION AND PROCESSING 

Our prototypic MMS initializes completely indoors. The starting 
position of the LiDAR SLAM defines the origin of a local 
coordinate frame. The LiDAR SLAM algorithm computes 
real-time sensor poses, which enable geometrically constrained 
camera triggering. We have implemented a time-based, a 
distance-based and a horizontal angle-based camera trigger 
constraint. In addition, the LiDAR SLAM provides a map 
overview containing the trajectory as well as the captured 
environment (see Figure 5). When data acquisition is completed, 
the LiDAR SLAM exports the so-called Cartographer state, 
which contains the optimized sensor trajectory. Further data 
acquisition results are the ROS bag file with LiDAR and IMU 
messages (see section 4.3), the raw images from the panorama 
camera as well as a list containing the camera trigger events with 
their corresponding timestamps. 
In order to perform image measurements, we require undistorted 
images and their respective exterior orientations. Figure 6 gives 
an overview of our entire post-processing workflow. 

First, we undistort the raw images using the calibrated IOPs. 
Applying the calibrated IOPs corrects the lens distortions, so that 
the images conform to the equidistant camera model. Abraham & 
Förstner (2005) provide 
the formulas for the conversion from the equidistant to the 
perspective camera model. We apply this conversion in a further 
step directly to the image measurements. 
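The conversion of an image measurement from the equidistant to the perspective model can be sketched as follows. The sketch assumes ideal models with a common focal length f, i.e. r = f · θ for the equidistant and r' = f · tan θ for the perspective projection, and image coordinates that are already reduced to the principal point and corrected for distortion. The function name and the example coordinates are illustrative.

```python
import math

def equidistant_to_perspective(x, y, f):
    """Map an image point from the equidistant to the perspective model.

    x, y: coordinates relative to the principal point, in the same unit as f.
    Equidistant: r = f * theta. Perspective: r' = f * tan(theta).
    """
    r = math.hypot(x, y)
    if r == 0.0:
        return 0.0, 0.0
    theta = r / f  # angle between the viewing ray and the optical axis
    if theta >= math.pi / 2:
        raise ValueError("ray is 90 deg or more off-axis; no perspective image point")
    scale = f * math.tan(theta) / r
    return x * scale, y * scale

# Nominal focal length in pixels from Table 1 (4.3 mm at 3.45 µm pixel size)
f_px = 4.3e-3 / 3.45e-6
x_p, y_p = equidistant_to_perspective(600.0, -250.0, f_px)
print(round(x_p, 2), round(y_p, 2))
```

Because tan θ grows without bound towards 90 deg, points far from the image center move strongly outwards. Converting individual measurements therefore avoids resampling the whole wide-angle image into a perspective one.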
Second, in order to obtain the corresponding exterior image 
orientations (EOPs) based on LiDAR SLAM, we extract the 
trajectory from the Cartographer state. For this, we extended 
Google Cartographer with a batch function that runs 
Cartographer’s assets writer and writes the final trajectory to a 
text file. In a next step, we perform a time-based interpolation 
of the image events between the trajectory events. We conduct a 
linear interpolation for the positions and a spherical linear 
interpolation based on quaternions for the orientations. Finally, 
we transform the interpolated positions and orientations with the 
calibrated ROPs and the manually measured boresight alignment 
parameters from the body frame to the frame of each camera 
head. 
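The time-based interpolation between two neighbouring trajectory events can be sketched as follows, with linear interpolation for the position and spherical linear interpolation (slerp) for the orientation. The quaternion order (w, x, y, z), the function names and the example values are assumptions made for this illustration; the subsequent transformation with the ROPs and boresight parameters is not shown.

```python
import math

def lerp(p0, p1, u):
    """Linear interpolation between two 3D positions, u in [0, 1]."""
    return tuple(a + u * (b - a) for a, b in zip(p0, p1))

def slerp(q0, q1, u):
    """Spherical linear interpolation between unit quaternions (w, x, y, z)."""
    dot = sum(a * b for a, b in zip(q0, q1))
    if dot < 0.0:
        # q and -q describe the same rotation; choose the shorter arc
        q1 = tuple(-c for c in q1)
        dot = -dot
    if dot > 0.9995:
        # Nearly identical orientations: fall back to normalized lerp
        q = tuple(a + u * (b - a) for a, b in zip(q0, q1))
    else:
        omega = math.acos(dot)
        s0 = math.sin((1.0 - u) * omega) / math.sin(omega)
        s1 = math.sin(u * omega) / math.sin(omega)
        q = tuple(s0 * a + s1 * b for a, b in zip(q0, q1))
    norm = math.sqrt(sum(c * c for c in q))
    return tuple(c / norm for c in q)

def interpolate_pose(t, t0, p0, q0, t1, p1, q1):
    """Pose at image time t between trajectory events at t0 and t1."""
    u = (t - t0) / (t1 - t0)
    return lerp(p0, p1, u), slerp(q0, q1, u)

# Example: halfway between identity and a 90 deg rotation about the z-axis
q_start = (1.0, 0.0, 0.0, 0.0)
q_end = (math.cos(math.pi / 4), 0.0, 0.0, math.sin(math.pi / 4))
position, orientation = interpolate_pose(
    0.5, 0.0, (0.0, 0.0, 0.0), q_start, 1.0, (2.0, 0.0, 0.0), q_end
)
print(position, orientation)
```

In this example the interpolated orientation corresponds to a rotation of 45 deg about the z-axis, i.e. the rotation progresses at a constant angular rate between the two trajectory events.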
 

Figure 5. Screenshots of the data acquisition software. Top: 
Cartographer SLAM preview, Bottom: Image preview and 
conditional trigger settings 
 
In application scenarios with high accuracy requirements, we aim 
to improve the LiDAR SLAM-based EOPs with a subsequent 
image-based georeferencing. For this purpose, we introduce the 
undistorted images into the incremental structure-from-motion 
(SfM) tool COLMAP (Schönberger & Frahm, 2016), which 

Cavegn et al. (2018) extended with georeferencing capabilities 
by integrating prior EOPs and exploiting ROP constraints. 
Hence, we use LiDAR SLAM-based EOPs of the first camera 
(cam0) as initial values, and fix the ROPs of the other camera 
heads with the pre-calibrated values. The complete process of 
integrated georeferencing based on COLMAP is described in 
detail by Cavegn et al. (2018). Good lighting conditions and a 
structurally rich environment for proper and well-distributed 
feature detection are essential. 
 

Figure 6. Flow chart indicating our data post-processing 
workflow 
 
 
6. PERFORMANCE EVALUATION 

In order to compare the performance of LiDAR SLAM-based 
image poses and subsequently improved poses using image-
based georeferencing, we carried out investigations in an indoor 
environment. 
 
6.1 Test Site and Data Acquisition 

Our indoor study area is located on the sixth floor of the main 
campus building of the University of Applied Sciences and Arts 
Northwestern Switzerland (FHNW) in Muttenz close to Basel. It 
features a hallway with dimensions of approx. 27 m x 24 m that leads 
to several offices, computer rooms, lecture rooms, staircases and 
elevators. The typical corridor width is approximately 3 m, with 
a minimum of about 2 m (left part in Figure 7) and a maximum 
of 4 m (right part, no. 3 in Figure 7). We determined the 3D 
coordinates of numerous reference points on natural 
markings, e.g. on doorframes and on elevator and room corners, by 
tachymetry. These reference points have an accuracy of 
approximately 5 mm. We used several of them for our 
investigations as ground control points (GCPs) or check points 
(CPs). 
We started the data acquisition on 21.03.2018 at 10:50 at the 
origin of the local geodetic coordinate frame, which is marked 
with a green diamond. The acquisition required 24 minutes. The camera trigger 
constraints were set to a distance interval of 1 m and an azimuth 
change of 15 degrees. In total, we captured a set of 1518 images 
with the panorama camera corresponding to six images at 253 
different locations, indicated as blue filled circles in Figure 7. 
 

6.2 Image-Based Georeferencing 

We defined the ROP configuration as well as its corresponding 
calibrated values in a JSON file, where the backward-facing 
camera head cam0 serves as master, and fixed these ROPs for the 
subsequent processing. Since the images from the upward-facing 
camera head cam5 predominantly contain homogeneous surfaces 
leading to few feature correspondences, we only processed 
images from the horizontally pointing camera heads cam0-cam4 
captured at 253 locations. COLMAP exploited the previously 
computed EOPs from online LiDAR SLAM, used the simple 
radial fisheye camera model and performed no refinement of the 
interior orientation parameters. 
COLMAP extracted DSP-SIFT features and carried out spatial 
feature matching with a maximum distance of 10 m, 200 
maximum neighbours, a maximum angle constraint of 100 
degrees and a maximum ratio of 0.9. This resulted in 1115 
registered images, 92'597 points, 779'782 observations, a mean 
track length of 8.4, 699.4 mean observations per image and a 
mean reprojection error of 0.79 pixels. These values are similar to 
the results obtained from a different mapping at the same location 
on 27.11.2017 (Cavegn et al., 2018). 
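The reported reconstruction statistics are related to each other by simple ratios, which the following short check reproduces from the numbers given in sections 6.1 and 6.2. The registration rate is a derived quantity that is not stated in the text, and the variable names are chosen for this example only.

```python
locations = 253
heads_per_location = 6     # full panorama capture (cam0-cam5)
heads_processed = 5        # cam0-cam4, cam5 excluded
registered_images = 1115
points = 92_597
observations = 779_782

captured_images = locations * heads_per_location        # images acquired
input_images = locations * heads_processed              # images passed to SfM
registration_rate = registered_images / input_images    # derived, not reported
mean_track_length = observations / points
mean_obs_per_image = observations / registered_images

print(captured_images, input_images, round(100 * registration_rate, 1))
print(round(mean_track_length, 1), round(mean_obs_per_image, 1))
```

The ratios reproduce the reported 1518 captured images, the mean track length of 8.4 and the 699.4 mean observations per image. They also indicate that roughly 88 % of the 1265 processed images were registered.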
 

Figure 7. Floor base map with overlaid projection centers of 
camera head cam0 (Trajectory), ground control points (GCPs) 
and check points (CPs) in the local geodetic coordinate system. 
At the dark red point symbols, we measured points at both the top 
and the bottom of a doorframe. 
 
6.3 Precision Analysis 

In order to perform a precision analysis, we determined 36 points 
in total and used 28 of them as CPs and 8 as GCPs (see Figure 7). 
We measured each point in a minimum of four different images, 
performed a bundle adjustment-based forward intersection with 
a Python tool and estimated the 3D point coordinates. In a first 
run, we used the LiDAR SLAM-based image poses and in the 
second run, we utilized the improved image poses from image-
based georeferencing to perform the forward intersection. The 
mean precision of the forward intersection with SLAM-based 
image poses amounts to 11.6 cm. In contrast, with image poses 
improved by image-based georeferencing, the mean precision of 
the forward intersection improves to 0.3 cm. Table 6 shows the 
mean precisions for GCPs and CPs respectively. This precision 
is a good indicator for the relative accuracy of typical 3D distance 
and area measurement tasks, which could be carried out 
irrespective of the presence of ground control points. 

ISPRS Annals of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Volume IV-1, 2018 
ISPRS TC I Mid-term Symposium “Innovative Sensing – From Sensors to Methods and Applications”, 10–12 October 2018, Karlsruhe, Germany

This contribution has been peer-reviewed. The double-blind peer-review was conducted on the basis of the full paper. 
https://doi.org/10.5194/isprs-annals-IV-1-13-2018 | © Authors 2018. CC BY 4.0 License.

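
The idea of a multi-ray forward intersection can be sketched as follows, assuming known projection centers and ray directions in object space. The sketch solves the linear least-squares ray intersection instead of the bundle adjustment-based formulation used in our Python tool, and it reports the RMS of the perpendicular point-to-ray distances as a simple precision indicator. Both choices, as well as all names and values, are assumptions for illustration.

```python
import math

def solve3(A, b):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, 3):
            f = M[r][col] / M[col][col]
            for k in range(col, 4):
                M[r][k] -= f * M[col][k]
    x = [0.0, 0.0, 0.0]
    for r in (2, 1, 0):
        x[r] = (M[r][3] - sum(M[r][k] * x[k] for k in range(r + 1, 3))) / M[r][r]
    return x

def forward_intersection(centers, directions):
    """Least-squares intersection of rays; returns (point, RMS ray distance)."""
    A = [[0.0] * 3 for _ in range(3)]
    b = [0.0] * 3
    units = []
    for c, d in zip(centers, directions):
        norm = math.sqrt(sum(v * v for v in d))
        u = [v / norm for v in d]
        units.append(u)
        # Accumulate the normal equations with the projector (I - u u^T).
        for i in range(3):
            for j in range(3):
                p = (1.0 if i == j else 0.0) - u[i] * u[j]
                A[i][j] += p
                b[i] += p * c[j]
    X = solve3(A, b)
    squared = []
    for c, u in zip(centers, units):
        v = [X[i] - c[i] for i in range(3)]
        along = sum(v[i] * u[i] for i in range(3))
        squared.append(sum((v[i] - along * u[i]) ** 2 for i in range(3)))
    return X, math.sqrt(sum(squared) / len(squared))

# Hypothetical example: four images observing the same point without error.
target = [1.0, 2.0, 3.0]
centers = [[0.0, 0.0, 1.5], [4.0, 0.0, 1.5], [4.0, 5.0, 1.5], [0.0, 5.0, 1.6]]
directions = [[t - c for t, c in zip(target, center)] for center in centers]
point, rms = forward_intersection(centers, directions)
print([round(v, 3) for v in point])
```

With error-free poses the rays meet in one point and the RMS distance vanishes; errors in the image poses make the rays skew, which shows up as a larger RMS distance.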
Furthermore, we transformed the resulting 3D points into the 
ground truth coordinate frame using the eight GCPs (see Figure 7) 
in order to evaluate the accuracy. For this purpose, we estimated 
a six-parameter rigid-body transformation with three translation 
and three rotation parameters. We compared the transformed 3D 
points of both sensor orientation approaches separately with the 
ground truth from tachymetry. At the CPs, the RMSE of the 3D 
point residuals measured with LiDAR SLAM-based image poses 
amounts to 13.3 cm. In contrast, the RMSE of the 3D point 
residuals measured with image-based georeferenced image poses is 
1.8 cm (see Table 6). The results of image-based georeferencing, 
which represent the absolute 3D measurement accuracy, are of the 
same order of magnitude as in Cavegn et al. (2018). 
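
The accuracy evaluation can be illustrated with a short sketch that applies a six-parameter transformation and computes the RMSE of the 3D residuals. The estimation of the parameters from the GCPs is omitted here. The rotation convention R = Rz(kappa) Ry(phi) Rx(omega), the function names and all numeric values are assumptions for illustration only.

```python
import math

def rotation_matrix(omega, phi, kappa):
    """R = Rz(kappa) @ Ry(phi) @ Rx(omega), angles in radians (assumed convention)."""
    co, so = math.cos(omega), math.sin(omega)
    cp, sp = math.cos(phi), math.sin(phi)
    ck, sk = math.cos(kappa), math.sin(kappa)
    return [
        [ck * cp, ck * sp * so - sk * co, ck * sp * co + sk * so],
        [sk * cp, sk * sp * so + ck * co, sk * sp * co - ck * so],
        [-sp, cp * so, cp * co],
    ]

def transform(points, angles, translation):
    """Apply X' = R X + t to every point."""
    R = rotation_matrix(*angles)
    return [[sum(R[i][k] * p[k] for k in range(3)) + translation[i]
             for i in range(3)] for p in points]

def rmse_3d(points, reference):
    """Root mean square of the 3D point-to-point residuals."""
    squared = [sum((a - b) ** 2 for a, b in zip(p, r))
               for p, r in zip(points, reference)]
    return math.sqrt(sum(squared) / len(squared))

# Hypothetical parameters: rotation of 30 degrees about z plus a translation.
angles = (0.0, 0.0, math.radians(30.0))
translation = (100.0, 200.0, 10.0)
local_points = [[1.0, 0.0, 0.0], [0.0, 2.0, 0.0], [0.0, 0.0, 3.0]]
transformed = transform(local_points, angles, translation)

# Hypothetical reference coordinates: transformed points plus known offsets in m.
offsets = [[0.01, 0.0, 0.0], [0.0, -0.02, 0.0], [0.0, 0.0, 0.0]]
reference = [[t + o for t, o in zip(p, off)]
             for p, off in zip(transformed, offsets)]
rmse = rmse_3d(transformed, reference)
print(f"RMSE of 3D residuals: {rmse * 100:.2f} cm")
```

Since no scale parameter is estimated, the transformation leaves the distances between the measured points unchanged.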
 

                  LiDAR SLAM-based poses    Improved poses by image-based georeferencing
                  GCPs        CPs           GCPs        CPs
Precision [cm]     8.2        12.3           0.2         0.3
Accuracy [cm]     10.6        13.3           1.3         1.8

Table 6. Precision and accuracy values for ground control points 
and check points from both LiDAR SLAM-based image poses 
and image-based georeferencing. Precision indicates the mean 
RMSE of forward intersection of single point measurements. 
Accuracy shows the RMSE of residuals to tachymetry. 
 
The precision and the accuracy values of the 3D point 
measurements based on the LiDAR SLAM image poses are of the 
same order of magnitude. Thus, the LiDAR SLAM image poses are 
suitable, e.g., for digitizing point objects at the decimeter 
level. By performing a subsequent image-based georeferencing, we 
improved the accuracies by roughly an order of magnitude. The 
results of measured 3D points with the improved image poses show 
that relative measurements in the sub-centimeter range and 
absolute measurements in the centimeter range are achievable. It 
should be noted that image-based measurements are far more 
intuitive than measurements in 3D point clouds. They also require 
very little training, in contrast to measurements in 3D point 
clouds, which typically require expert skills. 
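
As a worked check of the stated improvement, the ratios between the two approaches can be computed directly from the values in Table 6. The dictionary layout and the function name are arbitrary choices for this sketch. The precision improves by a factor of about 41, the accuracy by a factor of about 7 to 8.

```python
# Values copied from Table 6 (all in cm).
table6 = {
    "precision": {"slam": {"GCPs": 8.2, "CPs": 12.3},
                  "image": {"GCPs": 0.2, "CPs": 0.3}},
    "accuracy": {"slam": {"GCPs": 10.6, "CPs": 13.3},
                 "image": {"GCPs": 1.3, "CPs": 1.8}},
}

def improvement_factors(table):
    """Ratio of the LiDAR SLAM value to the image-based value per table cell."""
    factors = {}
    for measure, approaches in table.items():
        for group, slam_value in approaches["slam"].items():
            factors[(measure, group)] = slam_value / approaches["image"][group]
    return factors

factors = improvement_factors(table6)
for (measure, group), factor in sorted(factors.items()):
    print(f"{measure:9s} {group:4s} improves by a factor of {factor:.1f}")
```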
 
 
7. CONCLUSIONS AND OUTLOOK 

We successfully developed a portable mobile mapping system for 
efficient image-based infrastructure management. Our research 
platform and working prototype features a multi-head panorama 
camera for the actual image acquisition as well as two multi-
profile LiDAR scanners (horizontal and vertical) and a MEMS-
based industrial grade IMU for LiDAR-based SLAM. Our 
modular design allows straightforward adaptations of the sensor 
configuration and of the acquisition software based on the Robot 
Operating System (ROS). We also provided details on the 
calibration of the multi-head panorama camera Ladybug5 using 
the equidistant fisheye camera model. We furthermore 
implemented and presented a workflow for indoor data 
acquisition without a need for outdoor initialization. The 
workflow utilizes LiDAR-based SLAM for online mapping and 
progress monitoring as well as for post-processing purposes. 
Furthermore, we improved the camera poses with subsequent 
image-based georeferencing using relative orientation 
constraints.  

Our accuracy investigation delivered absolute 3D point 
accuracies in the range of 10.6 to 13.3 cm using image poses 
directly from LiDAR SLAM. Thus, using image poses from 
direct georeferencing based on LiDAR SLAM, 3D point 
coordinates can be determined with an accuracy at the decimeter 
level. By performing our image-based georeferencing, an 
accuracy improvement by an order of magnitude can be expected. 
Our investigations show absolute 3D point accuracies in the 
range of 1.3 to 1.8 cm. The excellent 3D point measurement 
precision of 0.2-0.3 cm, obtained after image-based 
georeferencing, indicates that the final 3D services will provide 
relative accuracies for typical measurement tasks well within the 
sub-centimeter level. 

In future work, we will address the overall system calibration, 
i.e. the precise alignment of the laser scanners and the panorama 
camera to the IMU. Moreover, we plan to mount additional 
cameras that will allow investigations with fixed stereo bases, 
presumably leading to more accurate image poses, especially in 
narrow passages. Furthermore, we will fuse the IMU data and the 
trajectory points from LiDAR SLAM within an Extended Kalman 
Filter (EKF). Since we expect EKF-based camera poses to be more 
accurate, this method could become an alternative to image-based 
georeferencing. We expect this approach to also perform well in 
indoor environments with poor structure and varying lighting 
conditions. 
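
The planned fusion can be pictured with a deliberately simplified filter. The sketch below is a linear Kalman filter for a single coordinate axis, i.e. the special case to which an EKF reduces when the motion and measurement models are linear. IMU acceleration drives the prediction and a LiDAR SLAM position corrects the state. It is not the planned implementation; the sensor rates, noise values, function names and the simulated motion are all assumptions.

```python
def predict(x, P, accel, dt, accel_var):
    """Propagate state [position, velocity] and covariance with an IMU acceleration."""
    p, v = x
    x_new = [p + v * dt + 0.5 * accel * dt * dt, v + accel * dt]
    # F P F^T with F = [[1, dt], [0, 1]], plus process noise from the acceleration.
    p00 = P[0][0] + dt * (P[1][0] + P[0][1]) + dt * dt * P[1][1]
    p01 = P[0][1] + dt * P[1][1]
    p10 = P[1][0] + dt * P[1][1]
    p11 = P[1][1]
    P_new = [[p00 + accel_var * dt ** 4 / 4, p01 + accel_var * dt ** 3 / 2],
             [p10 + accel_var * dt ** 3 / 2, p11 + accel_var * dt ** 2]]
    return x_new, P_new

def update(x, P, z, meas_var):
    """Correct the state with a position measurement z (H = [1, 0])."""
    S = P[0][0] + meas_var                  # innovation variance
    K = [P[0][0] / S, P[1][0] / S]          # Kalman gain
    y = z - x[0]                            # innovation
    x_new = [x[0] + K[0] * y, x[1] + K[1] * y]
    P_new = [[(1 - K[0]) * P[0][0], (1 - K[0]) * P[0][1]],
             [P[1][0] - K[1] * P[0][0], P[1][1] - K[1] * P[0][1]]]
    return x_new, P_new

# Simulated motion with constant acceleration (all values hypothetical):
# 100 Hz IMU, 10 Hz SLAM positions, initial position guess off by 0.5 m.
dt = 0.01
accel = 0.2
accel_var = 0.1 ** 2
slam_var = 0.05 ** 2
x = [0.5, 0.0]
P = [[1.0, 0.0], [0.0, 1.0]]
for k in range(1, 201):
    x, P = predict(x, P, accel, dt, accel_var)
    if k % 10 == 0:
        t = k * dt
        z = 0.5 * accel * t * t             # error-free SLAM position at time t
        x, P = update(x, P, z, slam_var)

true_position = 0.5 * accel * 2.0 ** 2
print(f"estimated {x[0]:.3f} m, true {true_position:.3f} m")
```

An EKF would replace the constant matrices with Jacobians of nonlinear motion and measurement models, for example when orientation is part of the state.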
 
 
ACKNOWLEDGEMENTS 

This work was co-funded by the Swiss Innovation Agency 
(Innosuisse, formerly CTI) as part of the BIMAGE project (No. 
18493.2 PFES-ES) in cooperation with iNovitas AG (Baden-
Dättwil, Switzerland). 
 
 
REFERENCES 

Abraham, S., & Förstner, W., 2005. Fish-eye-stereo calibration 
and epipolar rectification. ISPRS J. Photogramm. Remote Sens., 
59(5), pp. 278–288.  
 
Blaser, S., Nebiker, S., & Cavegn, S., 2017. System Design, 
Calibration and Performance Analysis of a Novel 360° Stereo 
Panoramic Mobile Mapping System. In: ISPRS Ann. 
Photogramm. Remote Sens. Spatial Inf. Sci., Hannover, 
Germany, Vol. IV-1/W1, pp. 207–213. 
 
Blaser, S., Nebiker, S., & Cavegn, S., 2018. On a Novel 360° 
Panoramic Stereo Mobile Mapping System. Photogramm. Eng. 
& Remote Sens., 84(6), pp. 347–356. 
 
Burkhard, J., Cavegn, S., Barmettler, A., & Nebiker, S., 2012. 
Stereovision Mobile Mapping: System Design and Performance 
Evaluation. In: Int. Arch. Photogramm. Remote Sens. Spatial Inf. 
Sci., Melbourne, Australia, Vol. XXXIX, Part B5, pp. 453–458. 
 
Cavegn, S., Nebiker, S., & Haala, N., 2016. A systematic 
comparison of direct and image-based georeferencing in 
challenging urban areas. In: Int. Arch. of the Photogramm. 
Remote Sens. Spatial Inf. Sci., Prague, Czech Republic, Vol. XLI, 
Part B1, pp. 529–536. 
 
Cavegn, S., & Haala, N., 2016. Image-Based Mobile Mapping 
for 3D Urban Data Capture. Photogramm. Eng. & Remote Sens., 
82(12), pp. 925–933. 
 
 
Cavegn, S., Blaser, S., Nebiker, S., & Haala, N., 2018. Robust 
and Accurate Image-Based Georeferencing Exploiting Relative 
Orientation Constraints. In: ISPRS Ann. Photogramm. Remote 
Sens. Spatial Inf. Sci., Riva del Garda, Italy, Vol. IV-2, 
pp. 57–64. 
 
Colas, F., 2016. XSens Driver, ROS Wiki 
http://wiki.ros.org/xsens_driver (3 April 2018). 
 
FLIR Inc., 2017. FLIR Ladybug5 USB3, Technical Reference. 
Richmond, BC. 
https://www.ptgrey.com/support/downloads/10128  
(3 April 2018). 
 
Glennie, C. L., Kusari, A., & Facchin, A., 2016. Calibration and 
stability analysis of the VLP-16 laser scanner. In: Int. Arch. 
Photogramm. Remote Sens. Spatial Inf. Sci., Lausanne, 
Switzerland, Vol. XL-3/W4, pp. 55–60. 
 
Hess, W., Kohler, D., Rapp, H., & Andor, D., 2016. Real-Time 
Loop Closure in 2D LIDAR SLAM. In: IEEE International 
Conference on Robotics and Automation (ICRA), Stockholm, 
Sweden, pp. 1271–1278. 
 
Kersting, A. P., Habib, A., & Rau, J.-Y., 2012. New Method for 
the Calibration of Multi-Camera Mobile Mapping Systems. In: 
Int. Arch. Photogramm. Remote Sens. Spatial Inf. Sci., 
Melbourne, Australia, Vol. XXXIX, Part B1, pp. 121–126. 
 
Kooi, B., 2014. MTi User Manual, MTi 10-series and MTi 100-
series. http://www.farnell.com/datasheets/1935846.pdf  
(3 April 2018). 
 
Lehtola, V.V., Kaartinen, H., Nüchter, A., Kaijaluoto, R., Kukko, 
A., Litkey, P., Honkavaara, E., Rosnell, T., Vaaja, M.T., 
Virtanen, J.-P., Kurkela, M., El Issaoui, A., Zhu, L., Jaakkola, A. 
& Hyyppä, J., 2017. Comparison of the Selected State-Of-The-
Art 3D Indoor Scanning and Point Cloud Generation Methods. 
Remote Sens. 9(8), https://doi.org/10.3390/rs9080796 
 
Leica Geosystems, 2018. Leica Pegasus Backpack Wearable 
Mobile Mapping Solution. 
https://leica-geosystems.com/products/mobile-sensor-platforms/capture-platforms/leica-pegasus-backpack 
(3 April 2018). 
 
Meilland, M., Comport, A. I., & Rives, P., 2015. Dense 
Omnidirectional RGB-D Mapping of Large-scale Outdoor 
Environments for Real-time Localization and Autonomous 
Navigation. J. Field Robot., 32(4), pp. 474–503. 
 
Nebiker, S., Cavegn, S., & Loesch, B., 2015. Cloud-Based 
Geospatial 3D Image Spaces—A Powerful Urban Model for the 
Smart City. ISPRS Int. J. Geo-Information, 4(4), pp. 2267–2291. 
 
Novak, K., 1991. The Ohio State University Highway Mapping 
System: The Stereo Vision System Component. In: Proc. of the 
47th Annual Meeting of the Institute of Navigation, 
Williamsburg, VA, pp. 121–124. 
 
Nüchter, A., Borrmann, D., Koch, P., Kühn, M., & May, S., 2015. 
A Man-Portable, Imu-Free Mobile Mapping System. In: Int. Ann. 
Photogramm. Remote Sens. Spat. Inf. Sci., La Grande Motte, 
France, Vol. II-3/W5, pp. 17–23. 
 
 
Prime Computer AG, 2017. PrimeMini Pro Data sheet. 
https://primecomputer.ch/wp-content/uploads/2017/09/Produktdatenblatt_PrimeMiniPro_all_configurations_neu.pdf 
(3 April 2018). 
 
Puente, I., González-Jorge, H., Martínez-Sánchez, J., & Arias, P., 
2013. Review of mobile mapping and surveying technologies. 
Measurement, 46(7), pp. 2127–2145. 
 
Quigley, M., Conley, K., Gerkey, B., Faust, J., Foote, T., Leibs, 
J., Berger, E., Wheeler, R., & Ng, A., 2009. ROS: an open-source 
Robot Operating System. ICRA workshop on open source 
software, 3(3.2), p. 5. 
 
Rau, J. Y., Su, B. W., Hsiao, K. W., & Jhan, J. P., 2016. 
Systematic calibration for a backpacked spherical 
photogrammetry imaging system. In: Int. Arch. Photogramm. 
Remote Sens. Spatial Inf. Sci., Prague, Czech Republic, Vol. XLI, 
Part B1, pp. 695–702. 
 
Rockey, C., & Purvis, M., 2015. Pointgrey Camera Driver. ROS 
Wiki http://wiki.ros.org/pointgrey_camera_driver  
(3 April 2018). 
 
Schmeing, B., Laebe, T., & Förstner, W., 2011. Trajectory 
Reconstruction Using Long Sequences of Digital Images from an 
Omnidirectional Camera. In: DGPF Tagungsband 20/2011, 
Mainz, Germany, pp. 443–452.  
 
Schönberger, J. L., & Frahm, J.-M., 2016. Structure-from-
Motion Revisited. In: IEEE Conference on Computer Vision and 
Pattern Recognition (CVPR), Las Vegas, USA, pp. 4104–4113. 
 
Schwarz, K. P., Martell, H. E., El-Sheimy, N., Li, R., Chapman, 
M. A., & Cosandier, D., 1993. VIASAT - A mobile highway 
survey system of high accuracy. In: Proceedings of the Vehicle 
Navigation and Information Systems Conference, Ottawa, 
Canada, pp. 476–481. 
 
Stachniss, C., Leonard, J. J., & Thrun, S., 2016. Simultaneous 
Localization and Mapping. Springer Handbook of Robotics, pp. 
1153–1176. 
 
Szwarc, H., 2017. Increasing information in building models. 
GIM International, 31(5), pp. 31–33. 
 
Thomson, C., Apostolopoulos, G., Backes, D., & Boehm, J., 
2013. Mobile Laser Scanning for Indoor Modelling. In: Int. Ann. 
Photogramm. Remote Sens. Spat. Inf. Sci., Antalya, Turkey, Vol. 
II-5/W2, pp. 289–293. 
 
Thrun, S., 2002. Robotic mapping: A survey. Exploring Artificial 
Intelligence in the New Millennium, 1, pp. 1–35. 
 
Velodyne, 2016. User’s Manual and Programming Guide, VLP-
16 Velodyne LiDAR Puck. 
http://velodynelidar.com/docs/manuals/VLP-16 User Manual 
and Programming Guide 63-9243 Rev A.pdf (3 April 2018). 
 
Vexcel, 2018. UltraCam Panther. 
http://www.vexcel-imaging.com/products__trashed/ultracam-panther/ 
(3 April 2018). 
 
Withley, J., 2016. Velodyne Driver. ROS Wiki 
http://wiki.ros.org/velodyne_driver (3 April 2018). 
 
