#computervision


Seeing @neauoire's paper computing explorations, I was reminded of an old idea I proposed at PaperCamp LDN back in 2009. Originally this was vaguely about using origami to create multi-purpose AR markers, but I later extended it to other use cases, some of which could be adapted and remain relevant even today, e.g. as a form of proof-of-work, protection against AI crawlers, or other forms of access control.

Some possible approaches for different use cases:

1) Ask users to perform a series of simple folds, then check the result by validating the fold lines in the flat sheet of paper via computer vision
2) Same as #1, but perform shape recognition/validation of the fully folded result
3) Unfold a pre-shared (and pre-folded) origami object, then check the result by validating the fold lines via computer vision
4) Provide instructions for multiple origami creations to create uniquely identifiable objects for computer-vision-based interactive environments

Approaches 1-3 are more or less about forms of proving physicality, work, or membership. Approach 4 is more about using origami as fiducial markers for general interactions.
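To make approach #1 concrete: assuming some hypothetical CV front-end has already extracted crease candidates from the photographed sheet (e.g. via a Hough transform) as (angle in degrees, normalized offset) pairs, the validation step reduces to matching them against the expected fold pattern within a tolerance. A minimal sketch of that matcher (all names and tolerances here are illustrative assumptions, not part of the original proposal):

```python
def validate_folds(detected, expected, angle_tol=5.0, offset_tol=0.05):
    """Check that every expected fold line (angle in degrees, normalized
    offset from the sheet centre) is matched by exactly one detected crease.
    Returns True only when the fold pattern matches the challenge exactly."""
    remaining = list(detected)
    for exp_angle, exp_offset in expected:
        match = None
        for i, (angle, offset) in enumerate(remaining):
            # fold lines are undirected, so compare angles modulo 180 degrees
            da = abs((angle - exp_angle + 90) % 180 - 90)
            if da <= angle_tol and abs(offset - exp_offset) <= offset_tol:
                match = i
                break
        if match is None:
            return False  # an expected crease is missing
        remaining.pop(match)
    return not remaining  # leftover (extra) creases also fail the check

# a "valley fold on the diagonal, then fold in half" challenge
expected = [(45.0, 0.0), (90.0, 0.0)]
detected = [(44.2, 0.01), (90.5, -0.02)]  # noisy measurements from the image
print(validate_folds(detected, expected))  # → True
```

Rejecting extra creases matters for the proof-of-work framing: otherwise a user could pre-crumple a sheet containing every plausible fold line and pass any challenge.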

More ideas/summary from that event:
adactio.com/journal/1546

cc/ @adactio

✍ 📰 Our Journalist-in-Residence program enters its next round

👩‍💻 The program offers journalists the opportunity to spend a paid three- to six-month residency in Tübingen researching a topic of their own choosing.

🤖 We provide introductions to topics such as #MachineLearning, #ComputerVision, and #Robotics.

📧 Please apply with a short concept paper, cover letter & CV (German or English) by 24 April 2025.
Please send applications by e-mail to janis.fischer@cyber-valley.de

To avoid a massive OpenCV dependency for a current project I'm involved in, I ended up porting my own homemade, naive optical flow code from 2008 and just released it as a new package. It was originally written for a gestural UI system for Nokia retail stores (prior to the Microsoft takeover); the package readme contains another short video showing the flow field being used to rotate a 3D cube:

thi.ng/pixel-flow

I've also created a small new example project for testing with either webcam or videos:

demo.thi.ng/umbrella/optical-f
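The thi.ng package itself is TypeScript, and I can't speak for its internals, but the classic "naive" approach to dense optical flow is exhaustive block matching: for each grid cell of the previous frame, search a small neighbourhood in the next frame for the best-matching patch by sum of absolute differences (SAD). A minimal NumPy sketch of that general technique (not the pixel-flow implementation):

```python
import numpy as np

def block_flow(prev, curr, block=8, radius=4):
    """Naive dense optical flow via exhaustive block matching.
    Returns an (H/block, W/block, 2) array of (dy, dx) motion vectors."""
    h, w = prev.shape
    flow = np.zeros((h // block, w // block, 2), dtype=np.int32)
    for by in range(h // block):
        for bx in range(w // block):
            y, x = by * block, bx * block
            patch = prev[y:y + block, x:x + block].astype(np.int32)
            best, best_sad = (0, 0), np.inf
            # exhaustive search of the (2*radius+1)^2 neighbourhood
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    yy, xx = y + dy, x + dx
                    if yy < 0 or xx < 0 or yy + block > h or xx + block > w:
                        continue
                    cand = curr[yy:yy + block, xx:xx + block].astype(np.int32)
                    sad = np.abs(patch - cand).sum()  # sum of absolute differences
                    if sad < best_sad:
                        best_sad, best = sad, (dy, dx)
            flow[by, bx] = best
    return flow

# synthetic check: shift a random frame 1 px down and 2 px right
rng = np.random.default_rng(42)
prev = rng.integers(0, 256, (32, 32), dtype=np.uint8)
curr = np.roll(np.roll(prev, 1, axis=0), 2, axis=1)
f = block_flow(prev, curr)
print(f[1, 1])  # interior blocks recover the shift: [1 2]
```

The O(blocks × radius²) cost is why real implementations use pyramids or gradient-based methods (Lucas-Kanade), but for coarse gestural input on a low-resolution camera feed this brute-force scheme is often good enough.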

Just published my latest tech dive:
"What a Week: From Quantum Particles to Team Red"! 🚀

Been losing sleep over Veo 2's computer vision breakthroughs (finally affordable for indie devs!), Microsoft's hunt for Majorana fermions (quantum computing's holy grail?), and AMD's continued market disruption (though seriously, their marketing dept needs work 😅).
That video on AI consciousness sent me down a philosophical rabbit hole that I'm still navigating.
The line between advanced pattern recognition and "awareness" gets blurrier every day...
These seemingly different technologies are all connected pieces of the same revolution. The most exciting innovations happen at the intersections!
Full thoughts here ➡️: smsk.dev/2025/02/27/what-a-wee

What tech keeps YOU up at night?


🌟 Exciting News! 🌟 Our group's paper V²Dial: Unification of Video and Visual Dialog via Multimodal Experts has been accepted at CVPR 2025! 🎉📚

V²Dial is a novel model specifically designed to handle both image and video input data for multimodal conversational tasks. Extensive evaluations on AVSD and VisDial datasets show that V²Dial achieves new state-of-the-art results across multiple benchmarks.

Congratulations to the authors. 🙌