3 Augmented Reality Frameworks for Windows Phone
Introduction
Augmented reality (AR) solutions and projects are seen as
that next step for a user interface or experience that melds technology with
the real world. With the embedding
of AR camera features in devices such as the Nintendo 3DS (and you don’t even
need to buy a game to play around with it), GPS recognition systems, and audio
detection, AR is slowly becoming mainstream.
AR is often perceived (by developers at least) as just too difficult or cumbersome to implement, but thanks to the developer community, this just isn’t the case. This article walks through what features are available with AR, explains terminology, and provides a high-level
comparison of some of the premier frameworks for use on Windows Phone.
Descriptions of AR types
This section details the three core concepts of AR, in which input from the real world is merged into the app experience.
· 3D space
AR frameworks
need to understand the 3D space of the world you are viewing and potentially
manipulating. These principles are the same for 3D games.
With 3D space types of AR, you apply a 3D representation of what you are displaying proportional to the view through a normal camera, based on either the axis along which the device is held or the position of objects in the real world. This could mean deforming the video image or simply overlaying an object or image on top of the video. Interaction is important, whether you are manipulating the 3D view or just responding to it.
3D space is important because you are trying to meld a 3D view through a camera with an artificial 3D view generated by the app or game. In most cases, this is achieved with simple AR markers or, in advanced cases, with spatial or feature recognition, which is described in the next section. Knowing where you are and what you are looking at in relation to what you intend to display is key.
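The core of that melding can be sketched with the standard pinhole projection model: a virtual point expressed in camera space is projected to the same pixel the real camera would map it to, so the overlay lands in the right place. This is a minimal, toolkit-independent sketch; the class and parameter names are mine for illustration.

```csharp
using System;

// Minimal sketch (not tied to any toolkit): project a virtual 3D point
// through the same pinhole model the camera uses, so an overlay drawn
// at the returned pixel lines up with the real-world point behind it.
public static class PinholeProjection
{
    // focalPx: camera focal length in pixels; (cx, cy): image centre.
    // (x, y, z): point in camera space, z pointing away from the lens.
    public static (double u, double v) Project(
        double x, double y, double z,
        double focalPx, double cx, double cy)
    {
        if (z <= 0) throw new ArgumentException("Point must be in front of the camera.");
        // Perspective divide: farther points move toward the image centre.
        double u = cx + focalPx * (x / z);
        double v = cy + focalPx * (y / z);
        return (u, v);
    }
}
```

A point straight ahead projects to the image centre; a point offset to the side moves across the image in proportion to its distance.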
· Video recognition
A more advanced AR system recognizes what it reads from the video input by depth, motion, or feature definition.
The simplest recognition method is to use some kind of fixed image (commonly referred to as an AR card or marker), which can range from a QR code to a strongly colored image like the 3DS AR cards. A distinctive image is key so the software can easily recognize where in the camera view the image is and what its current orientation is.
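As a small illustration of why orientation falls out of a distinctive marker: once a detector has located the marker's corners in the camera frame, its in-plane rotation is just the angle of one edge. This is a hypothetical helper for illustration, not the API of any toolkit covered here.

```csharp
using System;

// Illustrative only: given the detected corner points of a square AR
// marker, recover its in-plane rotation from the top edge. The corner
// naming is an assumption, not any real detector's output format.
public static class MarkerOrientation
{
    // Returns the marker's roll in degrees, measured from the
    // top-left -> top-right edge (0 = upright, 90 = rotated a quarter turn).
    public static double RollDegrees(
        double topLeftX, double topLeftY,
        double topRightX, double topRightY)
    {
        double dx = topRightX - topLeftX;
        double dy = topRightY - topLeftY;
        return Math.Atan2(dy, dx) * 180.0 / Math.PI;
    }
}
```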
More advanced systems
can actually recognize the world without the use of such cards. These systems
can recognize features such as faces (as commonly used in modern-day cameras)
and also depth (usually with the aid of IR depth reading cameras, like the
Kinect). However, these systems generally require complex algorithms and, more
importantly, power—both CPU and electric, both of which are limited on
most mobile form factors. In many cases, this level of AR isn’t needed, and the effect can be achieved with the simpler solution using fixed images.
One of the more
fascinating projects I’ve seen using this approach is the display of the final
construction of a shopping center on a handheld device / tablet. This display
allows you to virtually walk around the site and see the constructed building
before it’s even built.
· Geo position
Another slant on the AR world is an object’s exact location in relation to where the camera or person is in the real world. This uses the GPS coordinate of the object in relation to the GPS coordinate of the user and the direction he or she is facing. Most applications that employ this create a collection of places (like restaurants and gas stations) and display icons or images when the camera is facing in their general direction. Implementations can also use nearby surroundings to alter the view or game world as the player is using it. For example, a Hangman-style game may search for points of interest near the player and then offer up words relating to them (e.g., food items if near a supermarket, or sports celebrities if near a sports stadium).
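The core geo-position test can be sketched in two steps: compute the compass bearing from the user to the object, then check whether that bearing falls within the camera's horizontal field of view for the direction the user is facing. A minimal sketch; the names are mine, and real toolkits also account for distance, altitude, and sensor noise.

```csharp
using System;

// Sketch of the core Geo AR test: is a point of interest inside the
// camera's horizontal field of view? Class and method names are
// illustrative, not from any toolkit.
public static class GeoAr
{
    // Initial bearing (degrees clockwise from north) from the user
    // at (lat1, lon1) to a point of interest at (lat2, lon2).
    public static double BearingDegrees(double lat1, double lon1,
                                        double lat2, double lon2)
    {
        double phi1 = lat1 * Math.PI / 180, phi2 = lat2 * Math.PI / 180;
        double dLon = (lon2 - lon1) * Math.PI / 180;
        double y = Math.Sin(dLon) * Math.Cos(phi2);
        double x = Math.Cos(phi1) * Math.Sin(phi2)
                 - Math.Sin(phi1) * Math.Cos(phi2) * Math.Cos(dLon);
        return (Math.Atan2(y, x) * 180 / Math.PI + 360) % 360;
    }

    // True if the POI's bearing falls within the camera's horizontal
    // field of view, given the heading the device is facing.
    public static bool InView(double headingDeg, double bearingDeg, double fovDeg)
    {
        // Smallest angular difference between heading and bearing,
        // handling the wraparound at north (0/360 degrees).
        double diff = Math.Abs(((bearingDeg - headingDeg + 540) % 360) - 180);
        return diff <= fovDeg / 2;
    }
}
```

A point due east of the user has a bearing of 90 degrees; with the device facing 90 degrees and a 30-degree field of view, anything bearing 75 to 105 degrees would get a tag.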
The Toolkits
When deciding to implement AR in your app or game (or when considering its design), it’s important to choose a framework that is best geared to help you meet your goals.
There are several AR toolkits available, and if you browse through the App Hub catalogue, you’ll find at least one AR sample there as well. However, the App Hub sample is more an exploration of the Motion API (one of the best) than of AR. Even with its simple use of the camera feed, it just displays two textures relative to the position of the device (the frame of the video and the 3D overlay).
Each of the toolkits described below has
unique abilities aimed at solving certain issues or providing unique features.
The following overview provides a high-level description of each of the top three AR toolkits. There are no doubt many more out there, but these make up the cream of the crop for simplicity and ease of use.
· Silverlight AR Toolkit (SLAR)
http://slartoolkit.codeplex.com
Overview
On its website, the SLAR Toolkit is described as a flexible AR library for Silverlight and Windows Phone, with the aim of making real-time AR applications with Silverlight as easy and fast as possible. It can be used with Silverlight's Webcam API (or with any other CaptureSource), WriteableBitmap, or Windows Phone's PhotoCamera.
The SLAR Toolkit is based on the established ARToolkit (http://www.hitl.washington.edu/artoolkit) and NyARToolkit (http://www.artoolworks.com/products/desk-top/nyartoolkit), which use a dual-license model. It can be used for open- or closed-source applications under certain conditions.
It has implementations in Silverlight 4 and 5 (including hardware acceleration in SL 5), and it’s available for Windows Phone.
If you are feeling adventurous, you could also use this in a SilverXNA project (the Silverlight/XNA integration in WP 7.1), potentially rendering 3D on top of your Silverlight framework. A great example of this was shown off at Build using Kinect, where QR codes were placed around a room and different interactions were kicked off when the camera detected different codes. You could use recognition rather than the video feed itself to kick off actions. Think beyond!
Project Examples
Here are a few images of what others have done with the SLAR toolkit:
Using the Framework
The SLAR Toolkit has one of the easiest implementations I’ve seen. It breaks down like this:
o Set up a standard video capture source.
o Initialize the AR recognition engine passing the capture source as a reference.
o Hook up to the events for the AR recognition engine.
o Start the camera capture source.
For example:
// Load the AR marker from the generated marker file
var markerSlar = Marker.LoadFromResource(
    "data/Marker_SLAR_16x16segments_80width.pat", 16, 16, 80.0);

// Initialize the detector with a camera capture source
ArDetector = new CaptureSourceMarkerDetector(
    captureSource, 1, 4000, new List<Marker> { markerSlar });

// Hook up to the marker detection event
ArDetector.MarkersDetected += (s, e) =>
{
    var detectedResults = e.DetectionResults;
};
All that is left is to do something when the AR marker is detected. (The site already comes with a prepared AR marker to use, but you can make your own if you wish.) The samples and documentation on the site give some basic implementation examples that can help get you on your way.
There’s a great article here that shows this implementation on Windows Phone: http://kodierer.blogspot.co.uk/2011/05/augmented-mango-slartoolkit-for-windows.html
In my own experience of implementing SLAR in a SilverXNA project, it is important to remember that you are melding two separate worlds. The SLAR Toolkit gives you exactly what you need to position your XNA 3D world based on what the camera is seeing. Remember, the camera can move on its own (thanks to the person holding it) and is no longer under the control of the game engine!
Resources
Alternatively, you can use the Balder3D engine in Silverlight for a different twist. It is compatible with Windows Phone. The SLARToolkit even has samples written for Balder! (http://balder.codeplex.com/)
If you get stuck, there are many people willing to help via the discussions on the CodePlex site or via its creator, Rene Schulte (http://kodierer.blogspot.co.uk).
· Goblin XNA
Overview
Goblin is a completely different beast than the SLAR Toolkit. Goblin offers a much wider range of capabilities and features right out of the box, including physics and networking support via additional open-source libraries that are cleverly integrated.
Goblin provides:
o Full support for 3D scene manipulation and rendering.
o 6DOF (six-degrees-of-freedom) position and orientation tracking.
o Support of the Vuzix iWear VR920 head-worn display in monoscopic and stereoscopic modes (3D viewing).
o A 2D GUI system to allow the creation of classical 2D interaction components.
This extra complexity does come at a little extra cost in terms of brain matter. What this really comes down to is that you have to think in 5D—not just the position and orientation of the 3D scene you are drawing but also the relative position where you are drawing the scene plus the position and orientation of the camera / person holding the camera.
This gives more detailed and expansive options in what you can implement. You only have to look at some of the research projects done with Goblin, ranging from simple interactive surfaces to a full 3D island maze where 3D objects collide with real-world objects.
On the bright side, the framework does a lot of the grunt work and math. The only limit is your imagination.
Project Examples
Here are a few images of what others have done with the Goblin toolkit:
Using the Framework
At the time of writing, Goblin XNA supports up to XNA 4.0. Even though there isn’t a specific Windows Phone release at the moment, the library will run under Windows Phone and there are plans to release a phone version later.
I’d recommend a solid understanding of XNA before you start down this path.
Similar to SLAR, to use the toolkit to recognize markers you would initialize Goblin as follows:
// Add this video capture device to the scene so that it can be used for
// the marker tracker
scene.AddVideoCaptureDevice(captureDevice);

// Create an optical marker tracker that uses the ARTag library
tracker = new ARTagTracker();

// Set the configuration file to look for the marker specifications
tracker.InitTracker(638.052f, 633.673f, captureDevice.Width,
    captureDevice.Height, false, "ARTag.cf");

// Set the marker tracker to use for our scene
scene.MarkerTracker = tracker;

// Display the camera image in the background. Note that this parameter
// should be set after adding at least one video capture device to the
// Scene class.
scene.ShowCameraImage = true;
After that, the instructions walk you through composing your 3D scene and using the built-in physics; the toolkit takes care of the rest.
As stated, it's not quite as easy as SLAR, but it is very powerful once you have mastered the framework.
*Note: Goblin won’t work “out of the box” for Windows Phone and will require a little effort to make it work with the CameraSource used in WP. It’s not difficult, as the WP CameraSource is based on the DirectX version. At the time of writing, the Goblin team were working on a source port for WP7; check the discussion on the site for more info.
Resources
Channel 9 article explaining Goblin use end-to-end:
http://channel9.msdn.com/coding4fun/articles/Augmented-Reality-Domino-Knock-Down-Game
Goblin article using ALVAR for advanced tracking:
· GEO AR Toolkit (GART)
Overview
GART was created by Jared Bienz, a Microsoft employee living in Houston, Texas. Jared helps developers build applications for Windows Phone, so if you build something cool with GART, he'd love to hear about it.
The GART project describes itself as a framework that was created to help people quickly and easily build AR applications for Windows Phone.
This kit is different from other AR kits in that it enables “Geo AR.” Where other toolkits place virtual things on top of specially printed tags, this toolkit places information on top of real places in the world around you by tracking where you are and the direction you’re facing.
Geo AR apps are easy to write because all you need to provide is a collection of objects that have latitude and longitude points. These can come from anywhere: a Bing restaurant search, a Flickr photo search, or a Wikipedia article search, for instance. The framework then takes care of managing sensors and tracking where the user is in relation to the reference points. It can show where the points are in relation to the user from a top-down perspective (on a map), or it can show where the points are as a virtual lens into the real world.
Please note that GART makes heavy use of the Motion APIs shipping with Windows Phone Mango (OS 7.5), so it is recommended you have a motion-enabled device to use GART (if the device does not have a gyroscope, the Motion API will attempt to compensate with the sensors that are available). This should include all devices that ship with 7.5 as well as many of the existing 7.0 devices that have been upgraded to 7.5. The emulator, unfortunately, does not currently fully support the Motion API.
You could view GART as an extension to the SLAR Toolkit. However, GART does not use any of the video recognition or tracking features and solely relies on the Geo data it has to work with.
The toolkit simply
functions by using the GPS on the device (or one of the other location services
available on Windows Phone) in combination with the Motion API to discover
where you are and which direction you are facing. It then discovers places near
you from the web and plots them out. If a place is in your field of view
through the camera, then a tag is displayed.
The framework is not
limited to just the camera and can be easily extended to offer additional
information about the places of interest on the device in other ways (arrows
pointing to places to the left or right, for example).
The toolkit also offers additional services out of the box to further enhance the user's experience:
· HeadingIndicator – Draws a circle with a cone that rotates to show the user’s heading. Good for layering on top of a map.
· OverheadMap – Displays a Bing map that remains centered on the user’s location. Normally, the map is fixed to ‘North Up’, but it can also rotate to the user’s heading.
· VideoPreview – Essentially, a rectangle that displays video from the camera. Normally, it’s placed as the first (or lowest) layer and it’s set to fill the screen. But you can have multiple instances of this layer at different sizes and locations. For example, you could have an OverheadMap fill the screen with a small video preview in the corner.
· WorldView – Displays a virtual world in 3D space and applies matrix math to keep the virtual world aligned with what’s seen through the camera.
Note that there is a difference between facing and travelling. A person can be driving north on the freeway but looking west out the window.
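The alignment such a WorldView-style layer performs can be sketched at its simplest: a compass heading becomes a look direction in an east/north plane, which then feeds the view matrix. This ignores pitch and roll, which the real control also tracks via the Motion API; the class name here is mine.

```csharp
using System;

// Sketch of the heading half of keeping a virtual world aligned with
// the camera: convert a compass heading into the look-direction used
// to build the view matrix. Pitch and roll are omitted for brevity.
public static class HeadingMath
{
    // Unit look vector in a flat east/north plane for a heading in
    // degrees clockwise from north (0 = north, 90 = east).
    public static (double east, double north) LookVector(double headingDeg)
    {
        double rad = headingDeg * Math.PI / 180;
        return (Math.Sin(rad), Math.Cos(rad));
    }
}
```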
Items to display are simply sourced from a list, which can be local (good for geocaching types of projects) or come from Bing (the toolkit includes Bing search functionality).
Project Examples
Here are a few
images of what others have done with the GART toolkit:
Pinbucket Gart Sample project
Using the Framework
The basic steps
anyone would follow to build an app with the toolkit are:
· Start a new Windows Phone project
· Add an ARDisplay control to your page
· Add the views (or layers) you want as children of the ARDisplay
· In Page.OnNavigatedTo call ARDisplay.StartServices()
· In Page.OnNavigatedFrom call ARDisplay.StopServices()
· Create a collection of ARItem objects (or your own custom type that inherits from ARItem)
· Set the GeoLocation property of each ARItem to a location in the real world
· Set ARDisplay.Items equal to your new collection
As GART is Silverlight-based, all the necessary code is built into the UserControl provided by GART, so implementing it simply comes down to adding the control to your page:
<Grid x:Name="LayoutRoot">
    <ARControls:ARDisplay x:Name="ARDisplay" d:LayoutOverrides="Width">
        <ARControls:VideoPreview x:Name="VideoPreview" />
        <ARControls:OverheadMap x:Name="OverheadMap"
            CredentialsProvider="{StaticResource BingCredentials}" />
        <ARControls:WorldView x:Name="WorldView" />
        <ARControls:HeadingIndicator x:Name="HeadingIndicator"
            HorizontalAlignment="Center" VerticalAlignment="Center" />
    </ARControls:ARDisplay>
</Grid>
Then initialize the control in your page's OnNavigatedTo method and stop it in OnNavigatedFrom:
protected override void OnNavigatedTo(System.Windows.Navigation.NavigationEventArgs e)
{
    // Start AR services
    ARDisplay.StartServices();
    base.OnNavigatedTo(e);
}

protected override void OnNavigatedFrom(System.Windows.Navigation.NavigationEventArgs e)
{
    // Stop AR services
    ARDisplay.StopServices();
    base.OnNavigatedFrom(e);
}
All that remains is to source the labels you want displayed on the viewfinder; the examples provided with the framework allow you to roll your own or integrate with Bing.
Resources
The Motion API library reference:
http://msdn.microsoft.com/en-us/library/hh239189(v=VS.92).aspx
Bing Developer resources:
http://www.bing.com/toolbox/bingdeveloper/
Pinbucket case study:
Afterthoughts
Hopefully, this article shows you some of the possibilities with AR, from full-blown AR solutions to a hybrid mix of capabilities that can extend your current and future solutions. Additionally, it is intended to show that AR is not as daunting as it first appears.
Personally, I’d like to see more merged solutions where you blend
more of the real world in your applications or extend your application into the
real world.
I’ve seen projects with great potential. One such example is Kickstarter, an app for runners, where the app/game tracks where you are and plans a route for you. It sets target packages or activities to do on a run. The runner then collects points or artefacts that they can use at the end of their run to manage a virtual base. Each person’s base competes with other players around the web.
In my opinion, you could take almost any game, point it out of the window or, with a quick web search, drag the real world in, change the experience for the player, and make it different every time. You could use QR codes from everyday objects or even set up an Easter egg hunt using an app. Go wild, be fun and creative, and don’t settle—this is an augmented world, after all (a quick nod to Deus Ex there).