Category Archives: AR

GIS Visions 2045

1. Spatial Interactive

Stanislav Sumbera, GIS Vision 2024, 5.3.2025, GIS Ostrava 2025

What happens here is more important than what happens now.
Space is naturally interactive, enabling collaboration and sharing.
The computer is not behind a 2D glass screen but understands 3D space and interactions within it.
People learn through observation, collaboration, and play.
Community Computer
Projective Augmented Reality (Projective AR)
Bret Victor – Dynamicland

Image sources: https://gislab.utk.edu/tag/ar-sandbox/ , Dynamicland.org, Lightform

HMD / Head-Mounted Displays – Apple Vision Pro
Spatial Computing
Control through advanced gestures
“Super persistence” of objects – digital objects remain anchored as if truly part of the physical world
Pseudo-haptic feedback – realism in rendering creates the illusion of tactile response
Currently at the “UNIX Workstation” stage of the 1980s – showcasing possibilities that will later become accessible to everyone.
Also blogged here

2. Web, Open Source, and Technology Accessibility

From Google Maps → OpenLayers → Leaflet → MapBoxGL → MapLibreGL → ?
Each step represents greater availability, democratization, and accessibility of mapping technology, pushing development forward.
How difficult was it to render an image in 1993? How difficult was it to share that image with others? And today?
What is difficult, expensive, yet possible today that will become commonplace in 20+ years?

3. Lifespan of Data vs. Lifespan of Technology

WMS – Simple for visualization
Vector tiles – More complex to render (OGC API Tiles, MapBox Tiles, MapLibre Tiles – MLT)
3D tiles – OGC 3D Tiles, evolving standards for spatial data
More aesthetics, smoothness, and artistic expression in maps
Real-time rendering techniques, such as Gaussian Splat, for next-generation visualization

From the book: Enterprise SOA by Krafzig, Banke, Slama

4. Scanning Spaces and Objects

3D scanning is accessible to everyone
Spatial Video, Spatial Photo
3D scanning is as simple as taking a photo
Photorealistic scanning

5. Precise Geolocation ~2-10 cm

VPS (Visual Positioning System) – accuracy < 10 cm
5G geolocation
Affordable high-precision GNSS + RTK/PPP (< 10 cm)
Accessible VPS from panoramic images, Mapy.cz?

6. AI – Welcome to the Jungle

NPCs have become “thinking machines” (are we, on the other side, turning into NPCs ourselves, as in Jumanji 2?)

Image from Jumanji 2, driver – Mason Pike?

The Chinese Room paradox – an English speaker perfectly assembles answers in Chinese by following instructions, without understanding the Chinese language or the meaning of its symbols.
AI cannot create true originality but excels at combining and compiling existing inputs – a “super plagiarist” or “super puzzle solver”?
Might replace a significant amount of human (intellectual + routine) labor – in GIS (georeferencing, recognition/classification), programming/syntax, and more
“Hard work for machines, thinking for people” (Tomáš Baťa) is evolving into “(Pre)thinking* for machines, creativity/ideas for people” (in Czech: pre-mýšlení)
AI model marketplace – grow (cultivate) your unique “thought twin” that integrates into an open AI network.
Developer Twin: Blog post

Apple Vision Pro: Enabling Spatial Interaction


We have various terms for AR, like Mixed Reality, Metaverse, Spatial-ware, XR. Among these, Apple’s term “Spatial Computing” stands out for its emphasis on integrating physical space and digital interactivity. This resonates with me, as the concept of “Spatial” reflects how we model and interact with space in meaningful ways. Years ago, I coined the term “SpatialIn”, an open-ended label where “In” simply means Spatial is “in.” Later, with advancements like ARKit, I extended this idea into “Spatial Interactive,” emphasizing the interactive potential of the space around us behaving like a dynamic canvas. Vision Pro aligns perfectly with this vision. After testing Vision Pro, here are my key observations:

Hands-Free Interaction

Vision Pro’s hands-free interaction feels intuitive. Manipulating virtual objects with gestures or gaze eliminates barriers and enhances usability. Fluent hand movements remind me of my Tai-Chi classes from years ago.

From VR to True Mixed Reality Immersion

Unlike VR, Vision Pro allows safer navigation in real spaces while engaging with virtual elements. It maintains spatial awareness and visual contact with reality, making it both practical and immersive.

Unmatched Persistence

Vision Pro’s capability to retain virtual object placement across sessions is impressive. This feature is critical for practical applications such as architectural design, where models need to stay precisely where placed for accurate spatial referencing, or in education, where persistent virtual setups can create consistent and engaging learning environments. I found a model I placed earlier in the day still standing on the lower floor of the building, exactly where I left it. This happens without any explicit relocalization notification to the user, as the virtual model sticks to the physical space even across different floors. This is a must for advanced spatial computing design, as virtual space must keep an integrity similar to that of physical space.

Feeling Rendering on My Hands – sort of

The wide field of view and detailed lighting ensure a natural integration of digital and physical environments. Soft shadows and consistency make virtual elements feel tangible. Fidelity is so high that it creates the illusion of tactile sensation when interacting with virtual models. While purely subjective, this visual illusion convincingly engages my sense of touch, making the experience feel remarkably real to me. This visual-feel integration adds a layer of immersion that goes beyond sight and sound, engaging the sense of presence in a way that feels almost instinctive.

Here is a small example, although it uses only a flat browser with WebGL-powered MapLibreGL. Update 07/2025: live version here, e.g.: https://ikatastr.cz/3d/#kde=50.741216,15.001038,15.4,50,75&mapa=letecka&vrstvy=budovy3d,peaks,zsj,ku,obce,parcelybudovy&info=50.732718,14.984554

As there is no 3D map from Apple to test on the device by default, I had to convert a glTF to USDZ and send it to the device for QuickLook to get an experience of how a 3D city would look there. Here it goes:

CVPixelBuffer data layout in iOS 12.1 ARKit

bug notes: ...frame.capturedImage in iOS 12.1 started to show green lines and a wrong video frame.
The fix was to use the real CVPixelBufferGetBytesPerRowOfPlane instead of CVPixelBufferGetWidth
and stretch the image with UV coords based on the difference between the two.
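
The UV stretch in the fix above can be sketched as plain arithmetic. This is a minimal Python illustration, not the original iOS code: it assumes the plane is uploaded as a texture whose width is the full row stride, so only part of each row holds real pixels. The function name `uv_scale` and the numbers are hypothetical.

```python
def uv_scale(plane_width_px: int, bytes_per_pixel: int, bytes_per_row: int) -> float:
    """Return the fraction of a padded row that holds real pixels.

    When the texture is uploaded with the full stride as its width, this
    fraction is the maximum U coordinate to sample, so the padding at the
    end of each row is never shown.
    """
    payload = plane_width_px * bytes_per_pixel
    if bytes_per_row < payload:
        raise ValueError("bytes_per_row is smaller than the row payload")
    return payload / bytes_per_row


# Hypothetical numbers: a 1920-pixel-wide luma plane (1 byte per pixel)
# whose rows are padded to 1984 bytes.
u_max = uv_scale(plane_width_px=1920, bytes_per_pixel=1, bytes_per_row=1984)
```

With no padding the scale is exactly 1.0 and the texture coordinates stay unchanged; with padding, U runs from 0 to `u_max` instead of 0 to 1.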
https://lists.freedesktop.org/archives/gstreamer-devel/2012-November/037921.html

Both the stride (bytes of padding added to each row), as well as the
extended rows (rows of padding at the bottom or top of the buffer) are
important.
The extended rows are what changed in iOS6; you can find out how many rows
of padding are added to the buffer using:
CVPixelBufferGetExtendedPixels(pixelBuffer, &columnsLeft, &columnsRight,
&rowsTop, &rowsBottom)

In the example you gave (Medium preset on iOS6) you should be seeing rowsBottom = 8.

The stride is effectively CVPixelBufferGetBytesPerRowOfPlane() and includes
padding (if any).
When no padding is present CVPixelBufferGetBytesPerRowOfPlane() will be
equal to CVPixelBufferGetWidth(), otherwise it'll be greater.
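
To make the role of the stride concrete, here is a small Python sketch that copies the visible rows out of a padded buffer. It is only an illustration on raw bytes, assuming the data starts at the first visible row and that any extended rows sit after the visible ones; `unpack_plane` and the sample bytes are hypothetical and are not part of CoreVideo.

```python
def unpack_plane(data: bytes, row_payload: int, height: int, bytes_per_row: int) -> bytes:
    """Copy `height` rows of `row_payload` bytes out of a strided buffer.

    Each row starts `bytes_per_row` bytes after the previous one; bytes
    between the payload and the next row start are padding and are skipped.
    Rows beyond `height` (extended rows) are ignored.
    """
    if bytes_per_row < row_payload:
        raise ValueError("stride is smaller than the row payload")
    if len(data) < (height - 1) * bytes_per_row + row_payload:
        raise ValueError("buffer is too short for the given geometry")
    out = bytearray()
    for row in range(height):
        start = row * bytes_per_row          # step by stride, not by width
        out += data[start:start + row_payload]
    return bytes(out)


# Two visible rows of 4 bytes, padded to a stride of 6, plus one extended row.
padded = b"abcd__" + b"efgh__" + b"......"
tight = unpack_plane(padded, row_payload=4, height=2, bytes_per_row=6)
```

Stepping by the width instead of the stride would shift every row by the accumulated padding, which is the kind of skewed or striped frame the bug notes describe.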

Towards spatial interactive


At myVR I have spent the last few months bringing ARKit and ARCore prototypes/apps to life using the mMap SDK. They were presented at HxGN Live 2018 as part of Hexagon’s Xalt. You can see the demos on the page here: http://www.myvr-software.com/xrpocs/

Looking back, one of my best decisions was to go native in 2011 with the iOS app “Spatial Reader” when starting to make mobile apps. It was not because iOS was more difficult and challenging, but because of the potential of the device as a whole, which can be exploited only by going deep into the platform. The tight integration Apple is pursuing is about interactivity, user experience and simplicity, and in the case of ARKit, literally ‘Spatial Interactivity’.

For HxGN Live we also used an anamorphic image to represent a real excavation – a type of image and projection that creates a perfect 3D illusion from a certain point of view.
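
The geometry behind such an anamorphic image can be sketched in a few lines. This is a simplified Python illustration, not the method used for the HxGN Live image: it assumes a flat ground plane at z = 0 and a single fixed eye position, and the function name `project_to_ground` is hypothetical. Each virtual 3D point is painted where the line of sight from the eye through that point crosses the ground.

```python
from typing import Tuple

Vec3 = Tuple[float, float, float]


def project_to_ground(eye: Vec3, point: Vec3) -> Tuple[float, float]:
    """Return the (x, y) spot on the plane z = 0 where `point` must be painted
    so that it lines up with the real `point` when seen from `eye`."""
    ex, ey, ez = eye
    px, py, pz = point
    if ez <= pz:
        # The ray from the eye through the point never reaches the ground.
        raise ValueError("the virtual point must be lower than the eye")
    t = ez / (ez - pz)  # ray parameter where z becomes 0
    return (ex + t * (px - ex), ey + t * (py - ey))


# Eye 2 m above the ground; a virtual point 2 m below ground (inside the pit).
spot = project_to_ground(eye=(0.0, 0.0, 2.0), point=(2.0, 0.0, -2.0))
# spot == (1.0, 0.0): the pit corner is painted halfway between eye and point.
```

The illusion holds only from the chosen eye position; from any other viewpoint the painted points no longer line up with the virtual ones, which is why the image looks stretched from the side.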

MapKit with ARKit and overlays

Flyover mode in Apple Maps allows AR/VR-style interaction. This is not available by default to iOS developers using the underlying MapKit/ARKit technology. However, it is possible to test it, and the following short video shows this proof of concept: viewing cadastral maps (iKatastr) in a VR-like experience on iPad. Btw., Flyover mode on iOS 11 has some strange handling of overlays (described here), so loading of tiles is a little bit tricky. The iOS 10 version was much better (check the video here).

GEO Visual GPU Analytics notes

Update June 2019: idea on sequencing and replay mentioned here: https://blog.sumbera.com/2019/06/28/motion-sequence-and-replay-in-dynamic-maps/

****

With some delay, but before the year ends, I have to wrap up my presentation from the GIS Hackathon in Brno, March 2017, called Geo Visual GPU Analytics. The slides, in Czech, are at https://www.slideshare.net/sumbera/geo-vizualni-gpu-analytika and contain more pictures than text, so here I will try to add some comments to them.


slide 3,4: credits to my sources of inspiration: Bret Victor, Oblivion GFX, Nick Qi Zhu.

slide 5: this is a snippet from my “journey log” (working diary). Every working day I keep a short memo of what I did, or anything significant that happened. It serves several purposes; for example, in this case I had given up on trying WebGL, spent one or two days on another subject, and then returned to the problem, and voilà, I could resolve it. Every day counts; it helps to keep discipline and learn from past entries. Getting to know WebGL really opened the ‘New Horizons’ of the GPU computing universe.

slide 7: “better a bird in the hand than a pigeon on the roof” (the English equivalent is ‘A bird in the hand is worth two in the bush’). This proverb is put into the context of edge vs cloud computing on slide 9. In the hand is the edge; on the roof is the cloud. So I believe that what users can hold in their hand, wear, or experience ‘nearby’ is ‘better’ (or more exciting) than what exists somewhere far away (despite its better parameters).

slide 8: In Czech we have the same word for tool and instrument, ‘nastroj’, so the question is: a musical instrument, or just an instrument (aka tool)? This leads to the whole topic of latency in user interaction, described for instance here. I tend to compare the right approach to a musical instrument, where a tight feedback loop forms between the player and the instrument. The instrument must respond in less than 10 ms to tighten the feedback loop, so that the player feels the instrument as part of his own ‘body’ and can forget the ‘mechanics’ and instead flow with the expressiveness of the feelings he is interpreting or improvising. (right picture credit here) Why not have such tools in visual analytics? Why do we need to wait for a response from the server if the same task can be done quite well on the edge? The mGL library for GPU-powered visualization on the web and ImpactIN for iOS using Apple Pencil reflect this principle. We have real-time rendering; we need human-sense-time interaction, and the bloated abstractions of the current software stack do not help here despite the advances in hardware (nice write-up about the latency problem here). As a side note, there are types of computers with very low latency: check any synthesizer or digital instrument, where latency from user interaction must be very low. Hence the left picture on that slide represents them (a combination of MIDI pad + guitar).

Here is a short video from the Korg Monologue synth, showing something in use since the 70s. I consider this type of low-latency feedback loop applied to new domains a fascinating subject to explore. Notice the real-time filter modification.

slide 9,10: a nice chart from 2012 from britesnow.com on the cyclic nature of server vs client processing. I stated there that innovation happens on the client (on the edge), as servers (clouds, frames) can always do anything and everything. This is exaggerated, and related to slide 7 described above. Workstations, PCs, smartphones (1st iPhone), AR/VR devices, wearables in general, etc.: it is always about efficiency in the space used. Interestingly, NVIDIA GPU Gems states something similar at the chip level.

slide 11: a chart of GPUs outperforming CPUs, in conjunction with video resolution.

slide 12: The trickiest slide, ironically called “Find 10 differences”. On the left side is the program I wrote in 1993, in DOS; on the right, the one I wrote using WebGL in 2016. Both examples are great achievements. The right side does GPU-based filtering (or, in marketing terms, in-memory) with low user latency, so it redraws immediately as the user filters by pointing the mouse at the brush selector. The left was created in the DOS era, when each graphics card had its own way of mode switching, and that app could get the maximum out of the graphics card using 640×480 resolution with 256 colors! That was something at the time. However, something is wrong in trying to find 10 differences, as the two are basically so similar: both use a monitor, keyboard/mouse, and layout….
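
The brush filtering on the right side of that slide can be illustrated with a CPU stand-in. This is only a Python sketch of the idea, not the WebGL program from the slide: it assumes all points and their attribute values already sit in memory, and that each brush move simply re-evaluates a range test per point (on the GPU this test would run per vertex at draw time). The name `filter_by_brush` and the sample data are hypothetical.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def filter_by_brush(points: Sequence[Point],
                    values: Sequence[float],
                    brush: Tuple[float, float]) -> List[Point]:
    """Return the points whose attribute value lies inside the brushed range.

    No server round trip is involved: the data stays in memory and only the
    brush limits change between redraws.
    """
    lo, hi = min(brush), max(brush)
    return [p for p, v in zip(points, values) if lo <= v <= hi]


# Hypothetical data: map points with one numeric attribute each.
points = [(14.98, 50.73), (15.00, 50.74), (15.02, 50.75), (15.04, 50.76)]
values = [5.0, 12.0, 20.0, 31.0]

# The user drags the brush to the range 10..25; only matching points are redrawn.
visible = filter_by_brush(points, values, brush=(10.0, 25.0))
```

Because each brush move costs one pass over data that is already local, the redraw can follow the pointer instead of waiting for a server response.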

slide 13: the last slide, titled “Find 1 difference”, is the answer to the dilemma from slide 12: the AR experience, a new way of interaction, a new type of device for new workflows, visual analytics, exploration, etc. As one example of the many possibilities of AR, here is a nice video from HxGN Live 2017: