CVPixelBuffer data layout in iOS 12.1 ARKit

bug notes: ...frame.capturedImage in iOS 12.1 started to show green lines and wrong video frame.
Fix was to take real CVPixelBufferGetBytesPerRowOfPlane instead ofCVPixelBufferGetWidth
and stretch image with UV coords  based on the difference between the two.
https://lists.freedesktop.org/archives/gstreamer-devel/2012-November/037921.html








Both the stride (bytes of padding added to each row), as well as the
extended rows (rows of padding at the bottom or top of the buffer) are
important.
The extended rows are what changed in iOS6; you can find out how many rows
of padding are added to the buffer using:
CVPixelBufferGetExtendedPixels(pixelBuffer, &columnsLeft, &columnsRight,
&rowsTop, &rowsBottom)
In the example you gave (Medium preset on iOS6) you should be seeingrowsBottom = 8.

The stride is effectively CVPixelBufferGetBytesPerRowOfPlane() and includes
padding (if any).
When no padding is present CVPixelBufferGetBytesPerRowOfPlane() will be
equal to CVPixelBufferGetWidth(), otherwise it'll be greater.

 

Advertisements

Towards spatial interactive

at myVR I have been working last months  to bring into the life ARKit and ARCore prototypes/apps using mMap SDK.  At HxGN Live 2018 they have been presented as part of the Hexagon’s Xalt.  You can see the demos  on the page here: http://www.myvr-software.com/xrpocs/

Looking back, one of the best decision for me was to go native in 2011 for iOS app “Spatial Reader”  to make mobile apps. It was not because iOS was more difficult and challenging, rather potential of the device as whole  that can be exploited only by going deep into the platform. This tight integration Apple is pursuing  is about interactivity, user experience and simplicity, and  in this ARKit case – literally ‘Spatial Interactivity’.

For HxGN live we have also  used anamorphic image to represent real excavation – the type of image and projection that make perfect 3D illusion from certain point of view.

MapKit with ARKit and overlays

Flyover mode in Apple Maps allows  AR/VR style interaction.  This is not by default available for iOS developers using underlaying MapKit/ARKit technology.  However it is possible to test it and the following short video is about this proof of concept – viewing cadastral maps (iKatastr)  in VR like experience on iPad .  Btw. Flyover mode on iOS 11  has some strange handling of overlays – described here so loading of tiles is little bit tricky. The iOS 10 version was much more better (check the video here)

 

GEO Visual GPU Analytics notes

With some delay, but before the year ends, I have to wrap up my presentation from GIS Hackathon March/2017 in Brno called Geo Visual GPU Analytics . It is available here in CZ : https://www.slideshare.net/sumbera/geo-vizualni-gpu-analytika  .  There are more pictures than text, so here I will try to add some comments to the slides.

slide 3,4: credits to my source of inspiration -Victor Bret, Oblivion GFX, Nick Qi Zhu.

slide 5: this is a snippet from my “journey log” (working diary), I keep every working day a short memo what I did, or anything significant that happen. It serves to several purposes, for example in this case I have gave up on trying WebGL , spent one /two days on other subject and then returned to the problem – and viola, I could resolve the problem.  Everyday counts, it helps to keep discipline and learn from past entries. Getting to know WebGL opened really ‘New Horizons” of GPU computing universe.

slide 7: “better bird in the hand than a pigeon on the roof  ” (English equivalent is : A bird in the hand is worth two in the bush’ ). This proverb is put into the context of edge vs cloud computing on slide 9.  In the hands – this is the edge , in the roof – this is the cloud.  So I believe that what users can hold in their hand, or wear or experience ‘nearby’ ‘is better’ (or more exciting)  than what exist somewhere far away (despite its better parameters).

slide 8 : We have same term for tool and instrument in the Czech – ‘nastroj’  so the question is musical instrument or just instrument (aka tool)? This goes to the whole topic of latency in user interaction, described for instance here. I tend to compare the right approach with musical instrument where tight feedback loop happens between the player and the musical instrument. The instrument must respond in less then 10 ms to tighten the feedback loop so the player can feel this instrument as his own ‘body’ and forget on ‘mechanics’ rather flow on the expressiveness of the feelings for what he is interpreting or improvising.  (right picture credit here) Why not to have such tools in visual analytics ? Why we need to wait for response from the server if the same task can be done quite  well on the edge ? mGL library for GPU powered visualization on web  or ImpactIN for iOS using Apple Pencil  reflects this principle. We have real-time rendering, we need human-sense-time interaction and bloated abstraction of current software stack do not help here despite of the advance in the hardware –  nice write up about latency problem here   …and as a side note there are computers types with very low latency – check any synthesizer or digital instrument where latency from user interaction must be very low, hence the left picture  on that slide represents them (combination of MIDI pad + Guitar).

Here is a short video form the Korg Monologue synth  on something used from 70’s , I consider this type of low-latency feedback-loop applied to new domains fascinating subject to explore. Notice real-time filter modification.

slide 9,10: nice chart from 2012 from britesnow.com    on cyclic nature of server vs client processing.  I stated there that Innovation happens on client (on edge) as servers(clouds, frames)  can do always anything and everything. Exaggerated and related to the slide 7 described above.  Workstations, PC, Smartphones (1st iPhone), AR/VR devices, wearables in general etc… it is always about efficiency in used space. Interestingly NVIDIA GPU Gems states similar on chip level.

slide 11: GPU chart over-performing CPU in conjunction with video resolution.

slide 12: Most tricky slide called ironically “Find 10 differences”. On left side is the program I did in 1993, in DOS, on right  the one I did using WebGL in 2016. Both examples are great achievements, the right side does GPU-based filtering (or marketingly in-memory)  with low user latency so it redraws immediately as user filters by his mouse pointing on brush selector.  The left was created in DOS era where each graphics card has its own way of mode switching  and that app could utilize maximum of the graphic card using 640×480 resolution with 256 colors ! that was something that time. However something is wrong in trying to find 10 differences as they are basically so similar, both using monitor, keyboard/mouse, and layout….

slide 13:  last slide titled “Find 1 difference”is the answer on the dilemma from slide 12  – the AR experience, new way of interaction, new type of the device for new workflows, visual analytic, exploration etc.  For one example of many possibilities of AR, here is a nice video from HxGN live 2017: