To clearly highlight the current problems, as well as the requirements and potential benefits of the proposed idea, we start in Section 2 with examples of scenarios combining IoT with AR. Section 3, Section 5, and Section 6 continue with a survey of these three key areas, after which we summarize the results and discuss future directions, as well as open issues for realizing a future augmented smart living environment. Finally, in Section 7, we provide some concluding remarks and a summary of our main findings and contributions.

To further promote this idea, that is, the synergistic marriage of AR and IoT, this paper surveys the current state of AR services in terms of their scalability and examines notable approaches to their integration into the IoT framework as a control interface. Here, scalability refers to the number of objects that can be supported without significant latency. Note that AR system performance depends largely on data management and on object recognition and/or tracking. Additionally, to use AR as an interface with IoT, we examine previous studies showing how AR can offer an intuitive and natural method to communicate with IoT objects compared to other methods (such as a graphical user interface (GUI) with no visual, contextual, or spatial registration). We believe that data management, object-guided tracking, and interface design are key components for the development of an efficient IoT–AR infrastructure [11], and we focus on these three areas. To identify the current research status and future directions, we first summarize previous works (see Table 1 for details).

AR and IoT might have different objectives and seemingly unrelated concepts, but they can, in fact, complement each other [11,12]. First, AR offers a convenient and intuitive way for users to visualize and interact with IoT objects and their associated data. In principle, a spatially registered and visually augmented interface is direct and semi-tangible and is, thus, easy to comprehend and highly useful, particularly for everyday and/or anywhere usage [11]. For example, an AR client wearing a mobile or helmet-type device can instantly connect to an IoT product; receive relevant object-specific data, control information, and the associated AR datasets for the targeted service; understand the product’s state (or how to operate it) from the current datasets; and interact with the physical object through direct, natural interaction [11]. Note that “semi-tangible” refers to an interface that can be visualized and operated in real time using augmented virtual content connected to IoT objects in the real world [13]. Conversely, for AR, IoT as an infrastructure for “everywhere” service offers an efficient way to make AR “scalable” to the same degree by handling the necessary data management (for example, tracking data and content) in a distributed and object-centric fashion [1]. Thus, any IoT product can be accessed on the spot, locally and seamlessly, and the scalable interface allows location-based geographic and augmented reality services through AR clients [11]. Additionally, context-aware AR services become possible by tapping into the more refined environment information made available by the IoT infrastructure [1].

Recently, augmented reality (AR) and the Internet of Things (IoT) have received significant attention as key enabling technologies for making spaces smarter and more interactive [1,4]. AR is an interactive medium that provides a view of the real world augmented by spatially registered, useful computer-generated information. It helps people understand the world and amplifies their intelligence for solving problems and carrying out actual tasks [5,6]. IoT refers to a network of physical devices and everyday objects embedded with minimal computing elements for sensing, collecting, communicating, and even interacting with the objects themselves. Such an infrastructure will provide the basis for smart environments through collective big data analysis and context-based services (e.g., real-time analytics and automation) [7,10].

The final scenario concerns augmented reality interaction. As an example with physical objects in everyday life, Richard is using his AR remote control system to turn on small appliances in his home [13,24] (see Figure 2a). He is in a spacious living room and decides to turn on a light that was moved away from the TV yesterday; he can control the selected light with a superimposed GUI button that operates the attached actuator. Note that AR interaction for home appliances refers to device control through data exchange with the sensors and actuators attached in the real space; these techniques cover situations where the user can interact directly with the physical devices. When the AR user stands near the television, for example, he can immediately obtain useful status information from the IoT sensors and directly issue the desired control input to the TV using AR visual information [1] (see Figure 2b). Figure 2 shows an example of future AR control for IoT home appliances. In short, the AR user can execute direct interactions to manipulate all objects in the surrounding world.

As another futuristic scenario, this time for AR tracking connected to IoT sensors, Alice wants to study the functionalities of her new TV and plays its AR manual. The AR manual provides step-by-step instructions for learning the operations, and the visual AR content for each next operation is registered to the TV’s parts using AR tracking information. Although Alice has prebuilt feature datasets, she cannot achieve robust AR tracking performance because the textureless TV yields few feature points. Thus, she pushes a graphical user interface (GUI) button for guided AR tracking and can then execute robust registration selected according to the TV’s tracking characteristic, that is, the object’s tracking attribute stored in the IoT sensor. According to this scenario, in the near future we will be able to apply object-optimized AR tracking that considers each object’s characteristics (e.g., rich texture or textureless).
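The object-guided tracking in Alice’s scenario can be sketched as a simple lookup: the tracker is chosen from the tracking attribute stored with the IoT object. This is a minimal illustration; the attribute name `tracking_attribute` and the tracker labels are our own assumptions, not a published schema.

```python
# Hypothetical mapping from a stored tracking attribute to a tracking method.
TRACKERS = {
    "rich_texture": "feature-point tracker",
    "textureless": "edge-based model tracker",
}

def pick_tracker(iot_object):
    """Select a tracking method from the object's stored tracking attribute."""
    attr = iot_object.get("tracking_attribute", "rich_texture")
    return TRACKERS[attr]

# Alice's textureless TV advertises its own tracking attribute.
tv = {"id": "tv-1", "tracking_attribute": "textureless"}
print(pick_tracker(tv))  # edge-based model tracker
```

The point of the design is that the client need not probe the object’s appearance at runtime: the object itself tells the client which tracking approach suits it.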

John is on a business trip to New York. He feels a little chilly in the hotel and tries to raise the temperature of his room; however, he cannot find the thermostat. Using a standard AR-enabled IoT service mobile app, he finds and connects to the IoT thermostat (using near-field communication such as Bluetooth), and the app shows an AR-based control interface with which John easily adjusts the temperature, without having to fiddle with the actual device or call the front desk for help (Figure 1). This scenario illustrates how AR services can be scaled to “everywhere” IoT-enabled objects. The AR client can connect to any IoT object using the assumed standard protocols through local peer-to-peer communication, without having to go through a central server over the Internet. In the latter case, the central server would have to manage many millions of objects, with significant performance latency [11].
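The peer-to-peer flow in John’s scenario can be sketched as follows, assuming a hypothetical advertisement and control schema (a real system would use BLE or NFC discovery; the class and field names here are illustrative only).

```python
from dataclasses import dataclass, field

@dataclass
class IoTObject:
    object_id: str
    kind: str
    # Control descriptor that the AR client renders as an interface (assumed schema).
    controls: dict = field(default_factory=dict)

    def advertise(self):
        """Local broadcast payload, analogous to a BLE advertisement."""
        return {"id": self.object_id, "kind": self.kind}

    def handle(self, command, value):
        """Apply a control command from the AR client, clamped to the valid range."""
        lo, hi = self.controls[command]["range"]
        self.controls[command]["value"] = max(lo, min(hi, value))
        return self.controls[command]["value"]

class ARClient:
    def discover(self, nearby_objects, kind):
        """Pick a target of the requested kind from local advertisements,
        with no round trip to a central server."""
        for obj in nearby_objects:
            if obj.advertise()["kind"] == kind:
                return obj
        return None

thermostat = IoTObject("room-thermostat", "thermostat",
                       {"set_temperature": {"range": (10, 30), "value": 19}})
client = ARClient()
target = client.discover([thermostat], "thermostat")
print(target.handle("set_temperature", 23))  # John raises the temperature: 23
```

Because discovery and control stay local, the latency cost of a central registry of millions of objects is avoided, which is exactly the scalability argument above.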

Additionally, the AR system can track the relationship between a physical object and its tracking information, such as 3D features extracted from images. The user can then look at an IoT object through helpful AR content associated with the IoT environment, enabling efficient operation of that object; this relies on both the IoT communication capability and the graphical AR environment. On this basis, applications and services that connect everyday IoT objects and deliver AR experiences, such as AR manuals, training, control, and instructions, offer comprehensive platforms through which users can access and interact with mobile AR, covering everything and everyone engaging with the physical space [15]. Note that this approach can be realized using a filtering method in which AR clients near an IoT object identify and augment the candidate target object to obtain contextual datasets [11].

Here, we discuss three key components, namely, distributed AR data management, object-guided tracking, and context-based augmented reality (AR) interaction, and the resulting advantages of combining the Internet of Things (IoT) and AR. Figure 3 shows a possible architecture of such a combination as a basis for smart and interactive AR services. The AR service client interacts directly with the IoT object of interest in the immediate area and, upon connection, immediately receives context-relevant AR datasets (for tracking or customized service content) [1]. Depending on the context, the appropriate available services, such as a simple product information display, appliance control, or an instruction manual, are shown in the proper form (for example, through a mobile graphical user interface (GUI), AR glasses, mobile AR, voice, spatially registered AR, or a simple overlay) with which to interact.

4. Data Management for Physical Objects

Augmented reality (AR) frameworks and services, along with dataset management for everyday objects, are described briefly in this section. Additionally, the architectures, data processes, data structures, and content representations proposed in published works for physical objects interacting with AR are reviewed. Internet of Things (IoT) and AR services commonly need to manage generic data and service content for their constituent objects or augmentation targets, respectively, both of which are physical everyday objects. By having an IoT object communicate its own control interface information, the AR client can be configured to invoke certain control operations through AR interaction. The IoT object should therefore contain the essential information the AR client needs, for example, features for AR recognition and tracking, generic content and information about the object itself, the control interface, and additional content organized for operation. Note that the information exchange between the AR client and the IoT objects can occur directly between them or through a regional IoT server [1]. Herein, we review current approaches to managing such physical object data for AR use (for example, architecture and data handling) and discuss how they can be extended and scaled to the level of IoT [11].
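The per-object dataset just listed (tracking features, generic information, control interface, and operation content) can be sketched as a single record held by the IoT object. All field names below are hypothetical, chosen only to mirror the four categories in the text.

```python
# Hypothetical per-object AR dataset; the schema is illustrative, not published.
object_record = {
    "object_id": "tv-living-room",
    "tracking": {                       # features for AR recognition/tracking
        "attribute": "textureless",     # object's tracking characteristic
        "features": [[0.1, 0.2, 0.0], [0.4, 0.1, 0.0]],  # sample 3D features
    },
    "info": {"name": "Living-room TV"},       # generic object information
    "control_interface": {                    # operations the client may invoke
        "power": {"type": "toggle"},
        "volume": {"type": "range", "min": 0, "max": 100},
    },
    "contents": ["manual_step_1.html"],       # additional operation content
}

def build_ar_controls(record):
    """Turn the communicated control interface into widgets for the AR client."""
    return [(name, spec["type"]) for name, spec in record["control_interface"].items()]

print(build_ar_controls(object_record))  # [('power', 'toggle'), ('volume', 'range')]
```

Because the record travels with the object (directly or via a regional IoT server), a client that has never seen the object can still recognize it, render its controls, and fetch its operation content on the spot.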

In the early days of interacting with real objects through AR interfaces, researchers were concerned with AR frameworks capable of ubiquitous communication between physical objects and the AR device [31]. Recent works have attempted to map the sensor–object relationship and to apply filtering approaches (to reduce the search space) in the near space [1]. Specifically, in one notable AR framework, Iglesias et al. suggested an intelligent selection of resources based on the user’s attributes, user–object proximity, relative orientation, resource visibility, and AR interaction with the object [32]. They developed an AR-based object browser with context-aware representation of resources. Additionally, Ajanki et al. constructed an augmented reality interface with contextual information and defined context-sensitive virtual information about people, offices, and artifacts [33]. They then suggested a filtered AR concept that retrieves teaching and research project information to help a visitor to a university department.

Figure 6 shows the future process flow for AR frameworks based on physical objects, considering AR datasets with scalability. An AR user interested in an AR service can retrieve artificial markers (or natural features) from surrounding objects; the relationship between the physical and virtual object IDs must be defined in advance. The user, holding a mobile device, can then visualize the filtered AR objects, that is, nearby IoT-capable objects selected by their relative distance or direction from the user, even in new places. The client AR system directly receives the “feature” and “content” information for each object from the attached sensor. The AR user can then experience an efficient AR environment (e.g., an IoT control interface) that mixes in virtual objects by donning a video see-through head-mounted display (HMD) or using a mobile phone with an attached camera module [1].
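The filtering step in this process flow can be sketched as a distance-and-direction test: keep only the IoT-capable objects within a given range and field of view of the user. The function name, thresholds, and 2D geometry below are simplifying assumptions for illustration.

```python
import math

def filter_candidates(user_pos, user_heading_deg, objects,
                      max_dist=5.0, fov_deg=60.0):
    """Return objects near the user and roughly in front of the camera."""
    visible = []
    for name, (x, y) in objects.items():
        dx, dy = x - user_pos[0], y - user_pos[1]
        if math.hypot(dx, dy) > max_dist:
            continue                              # too far away
        bearing = math.degrees(math.atan2(dy, dx))
        diff = (bearing - user_heading_deg + 180) % 360 - 180
        if abs(diff) <= fov_deg / 2:
            visible.append(name)                  # inside the field of view
    return visible

# User at the origin, facing along the +x axis.
objects = {"tv": (2.0, 0.5), "lamp": (0.5, 3.0), "fridge": (9.0, 0.0)}
print(filter_candidates((0.0, 0.0), 0.0, objects))  # ['tv']
```

Only the surviving candidates need to transmit their “feature” and “content” information, which is what keeps the search space, and hence the latency, small as the number of IoT objects grows.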

Many studies have addressed augmented reality using the cloud as the computing resource [34]. These works demonstrated savings in computation time by offloading the heavy matching of large feature sets from mobile devices with limited computing capability. Additionally, recent works have focused on the process of registering and managing AR datasets with a cloud computing device [34]. These works are concerned with how to obtain tracking information and AR presentation datasets from the remote server. To access the surrounding objects, for example, the user can receive the collected AR attributes and tracking information shared via the server. Then, an AR user carrying only the AR browser program, with no tracking information or AR presentation datasets on the device, can easily connect to all of the things through the prebuilt relationship mapping.

A more recent trend is to improve the use of adjacent computing resources in the user’s surroundings, rather than relying on computing services at the far end of the network, such as a remote server or the cloud [1]. It will still be difficult for cloud services to support scalability to the level of “everywhere.” An alternative may be to connect to a single area server (serving only a particular local area, such as a single home) that manages only a limited number of objects [34]. The adjacent computing approach can be used to avoid problems such as bottlenecks at, and moving-object detection by, a remote server. Note that this approach is similar to the fog computing architecture in the sensor network domain, which emphasizes latency reduction with high-quality service and handles datasets at the network edge [16,35]. The AR user can connect directly to the objects in the surrounding area because the sensors attached to the objects can detect the user’s position within certain ranges in real time [17].
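The edge-first lookup described here can be sketched as trying a local area server (e.g., one serving a single home) before falling back to the cloud. The server structures and dataset fields are hypothetical placeholders.

```python
# Illustrative edge-first resolution of AR datasets; schemas are assumptions.

def resolve_ar_dataset(object_id, area_server, cloud_server):
    """Prefer the nearby area server; use the remote cloud only as a fallback."""
    if object_id in area_server:
        return ("area", area_server[object_id])
    return ("cloud", cloud_server.get(object_id))

area = {"lamp-1": {"features": "edge-model-v2"}}    # limited, local object set
cloud = {"lamp-1": {"features": "edge-model-v2"},
         "tv-9": {"features": "feature-pack"}}      # global registry

print(resolve_ar_dataset("lamp-1", area, cloud)[0])  # area  (served locally)
print(resolve_ar_dataset("tv-9", area, cloud)[0])    # cloud (fallback)
```

The design mirrors the fog computing argument in the text: most requests are answered at the network edge with low latency, and only objects unknown to the local area require a long-distance round trip.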

As another issue in providing AR datasets, some research has proposed detailed content structures for resource management. Kim et al. presented an AR content structure for building mobile AR applications in HTML5, as on the Web [28]. They extended the presentation logic of HTML to fit the current web architecture and used a referencing method that matches physical and virtual resources. To validate their process, they augmented a physical globe with sensor data fed from physical weather sensor stations (see Figure 7). Additionally, with a similar AR data structure, Muller et al. introduced a custom XML-based format to define AR manual structures for home appliances [36].

The situation is similar with IoT services and content. Most IoT services are implemented as applications. One promising direction is to use the Web to support interactions with physical objects, as exemplified by Google’s Physical Web [37]. Here, objects have URLs and can exhibit their own dynamic and cross-platform contents, represented in standard languages such as HTML and JavaScript. Thus, we can envision a future where various types of IoT services, including even AR, will be available under a unified Web framework, that is, the “webization” of things. Ahn and co-workers, for example, presented a content structure as an extension to HTML5 for building webized mobile AR applications [18]. This allows a physical object to be referenced and associated with its virtual counterpart. Figure 7 shows an example of associating a globe (physical object) with virtual objects (augmentation).

Additionally, we should consider the characteristics of AR contents according to the IoT device, because there are many different types of IoT devices. Note that this is similar to how website components are configured differently for mobile and desktop computing devices. Thus, for the IoT-enabled AR platform to be naturally applicable everywhere, it should adaptively control the degree of AR content representation depending on the nature of the IoT device.
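Such adaptive control of content representation can be sketched as choosing the richest rendering a client device supports, much as responsive websites adapt between mobile and desktop. The device capability flags and representation names below are assumptions for illustration only.

```python
# Hypothetical device-capability flags and content representation levels.

def select_representation(device):
    """Pick the richest AR representation the device can support."""
    if device.get("render_3d") and device.get("spatial_tracking"):
        return "registered_3d_model"   # full spatially registered AR
    if device.get("screen"):
        return "2d_panel"              # flat overlay panel on a screen
    return "overlay_label"             # minimal text/audio-style label

hmd = {"screen": True, "render_3d": True, "spatial_tracking": True}
phone = {"screen": True, "render_3d": True, "spatial_tracking": False}
wearable = {"screen": False}

print(select_representation(hmd))       # registered_3d_model
print(select_representation(phone))     # 2d_panel
print(select_representation(wearable))  # overlay_label
```

The same object record thus yields a fully registered 3D augmentation on an HMD but degrades gracefully to a flat panel or a simple label on less capable IoT clients.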