VISION-BASED ACTION RECOGNITION IN THE INTERNAL CONSTRUCTION SITE USING INTERACTIONS BETWEEN WORKER ACTIONS AND CONSTRUCTION OBJECTS

ISARC

J. Y. Kim and *C. H. Caldas

University of Texas at Austin

301 East Dean Keeton

Austin, TX, 78712, USA

(*Corresponding author: caldas@mail.utexas.edu)

This paper presents a novel action recognition method for observing human workers on interior construction sites that exploits the interactions between worker actions and related objects. The method can be used to measure work rates for labour productivity monitoring, which is critical because the performance of a construction project is significantly affected by labour productivity. However, construction sites are generally crowded with large numbers of workers and objects, and this congestion hinders the accurate, automatic recognition of workers’ actions. It is one reason that existing automatic action recognition studies in construction have focused mainly on workers’ actions alone. Yet the same crowded conditions offer many cues that can support automatic action recognition. According to psychological studies, clear interactions exist between human actions and related objects, such as between hammering and a hammer, and humans use these interactions to recognize actions and objects more accurately. On a construction site, workers, materials, tools, and equipment are planned in detail before actual construction: the categories of workers and objects are pre-defined, and specific interactions relate worker actions to objects. In this paper, the interactions considered are limited to human workers and their hand-held objects, and action recognition results are combined with hand-held object information to improve recognition accuracy. Even with this limited set of interactions, the experiments in this paper show a significant improvement in action recognition. This paper describes how these interactions are used to improve construction action recognition accuracy based on human skeleton data and 2D colour video from a Microsoft Kinect sensor.
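The fusion of action scores with hand-held object evidence described above can be sketched as a simple probabilistic re-weighting. The sketch below is illustrative only, assuming a skeleton-based classifier that outputs per-action scores and an object detector that reports one hand-held object; the action labels and interaction priors are hypothetical values, not the paper's actual model.

```python
# Hypothetical sketch: re-weighting skeleton-based action scores by
# hand-held object evidence (assumed values, not the paper's method).
ACTIONS = ["hammering", "sawing", "drilling"]

# P(object | action): assumed interaction priors between worker actions
# and hand-held objects, e.g. hammering strongly implies a hammer.
OBJECT_GIVEN_ACTION = {
    "hammering": {"hammer": 0.90, "saw": 0.05, "drill": 0.05},
    "sawing":    {"hammer": 0.05, "saw": 0.90, "drill": 0.05},
    "drilling":  {"hammer": 0.05, "saw": 0.05, "drill": 0.90},
}

def fuse(action_scores, detected_object):
    """Combine action-classifier scores with the detected hand-held object.

    Multiplies each action score by the interaction prior for the
    detected object, then normalizes to a probability distribution.
    """
    fused = {a: action_scores[a] * OBJECT_GIVEN_ACTION[a][detected_object]
             for a in ACTIONS}
    total = sum(fused.values())
    return {a: s / total for a, s in fused.items()}

# The skeleton classifier alone is uncertain between hammering and sawing,
# but detecting a hammer in the worker's hand resolves the ambiguity.
scores = {"hammering": 0.45, "sawing": 0.40, "drilling": 0.15}
fused = fuse(scores, "hammer")
best_action = max(fused, key=fused.get)
```

Here the object evidence sharply boosts the hammering hypothesis, mirroring the abstract's claim that interaction cues improve recognition accuracy in cluttered scenes.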
Keywords: Construction; Data; Information; Productivity; Computer vision; Analysis; STEPS