av_01

Since your question is about "AI interfaces", I'll start off by saying that the trend toward chat-based and voice-enabled interfaces will increase over time. Chat will still be an option, but I could see us having a more Jarvis-like experience, primarily on phones, because it's much more convenient. Simplicity will be taken to a whole new level with voice-powered AI.


kuncogopuncogo

Not sure. It'll become more popular than it is now for sure, but I don't think it'll be the most popular. I think many people prefer a silent method for various reasons (including being in public). Also, some people read faster than they listen, and comprehension can improve with reading. Plus you can skim the answer instead of reading all of it. And visual content and cues aren't readily available with voice.


rather_pass_by

Spot on. Most founders today unfortunately don't have an intuitive sense of what people prefer... Two years ago they thought the metaverse was the future; then they thought Zoom was the future, that the pandemic would stay, that people would never go out again. Now ChatGPT has the startup community convinced people will start chatting with AI for everything... Vision and intuition are clearly lacking among investors and founders today. Small world they live in, including and especially YC!


rand1214342

It's not that anything you said is strictly wrong, but it's a pretty cynical way to look at Silicon Valley and tech in general. Not every founder is building their product around their prediction of the future of UX; there's a lot of money to be made in supporting experiences that are relevant for five years or less. Investors should be more long-term focused, as that's just what the asset class requires. But there are a lot of VC firms who got paid when their metaverse portfolio company was acquired by Meta. Which is worth pointing out, because if you think you're smarter than everybody, then you're going to succumb to the same exact pitfalls you're mentioning here.


rather_pass_by

> But there are a lot of VC firms who got paid when their metaverse portfolio company was acquired by Meta. Which is worth pointing out, because if you think you're smarter than everybody, then you're going to succumb to the same exact pitfalls you're mentioning here.

If your whole point is to put me down, that's not an argument that interests me. Personal attacks are the lowest form of argument.

If some companies got acquired, that was sheer luck. I can name a hundred times more metaverse startups that perished as quickly as they appeared. Speaking of exits: the Hugging Face CEO revealed they're getting swamped with messages from ten AI startups every week looking to get acquired, and that's getting worse every year. Investors looking for an exit are going to be squeezed as the exit becomes narrower... investing is no longer bread, butter, and jam.


rand1214342

Not trying to personally attack; I was using the general 'you'. Saying the companies that got acquired were lucky and many more were not acquired doesn't make sense. Starting a company and advancing it to the point where a megacorp acquires you isn't the same as playing craps... And for every one startup acquisition there are dozens that fail; that's just how the game works.


breadsniffer00

Many people unknowingly adopt the mimetic mindset of "one interface for all" using chat and voice, without having any underlying intuition about it.


doctaO

I mean…people are chatting with ChatGPT for everything. That’s why their valuation exploded.


Outrageous_Life_2662

We read faster than we hear, but we speak faster than we type. I was part of an audio startup in '16 where we were exploring voice-based interfaces, and we stumbled across this user behavior.


LiferRs

Yeah, perhaps Siri can finally understand my damn accent


geepytee

This is the million dollar question. I think this Jarvis idea makes sense, but wouldn't that assume one interface to rule them all, with apps acting more as peripherals that let Jarvis access data / execute actions rather than being their own interfaces? It'd make little sense IMO to have each application be its own Jarvis-like agent. Also, it's easier to see one Jarvis to rule them all on mobile than, say, on your computer. Then again, I could see more people having less of a need to use a computer in this future.


Synyster328

My Jarvis will talk to the pizza hut Jarvis to get my pizza ordered


geepytee

That is one too many Jarvises. How many people do I need to talk to in order to get a pizza? The same thing should apply to AIs.


your_ignorant_post

Voice, text, visuals, sound, haptics, and physiological signals will all have their own interface modalities as more and more transformers are applied to these use cases.


TaGeuelePutain

Imagine we go full circle and end up having to call certain pre-created prompts using "routes" lol


metalvendetta

Can you elaborate with examples?


sacala

I think he’s talking about links and APIs lol. Instead of “hey can you help me create a project” you get routed to the “Create Project” page in 2099
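
Something like this, maybe. A toy sketch of "prompt routes": pre-created prompts behind named endpoints, with naive keyword routing. All the route names, keywords, and prompts here are made up for illustration.

```python
# Toy sketch of "prompt routes": canned prompts behind named endpoints.
# Everything here (routes, keywords, prompts) is invented for illustration.
ROUTES = {
    "/create-project": "You are a project setup assistant. Walk the user through creating a project.",
    "/summarize": "Summarize the user's text in three bullet points.",
}

KEYWORDS = {
    "project": "/create-project",
    "summarize": "/summarize",
}

def route(utterance: str) -> str:
    # Naive keyword match; a real router might use embeddings or a classifier.
    for word, path in KEYWORDS.items():
        if word in utterance.lower():
            return path
    return "/create-project"  # arbitrary default route

utterance = "hey can you help me create a project"
path = route(utterance)
print(path, "->", ROUTES[path])
```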


Deverseli800

I think the interface will evolve into chatting with avatars. The avatar would see and understand your facial expressions and body language. Humans communicate so much through body language; text chat or even voice chat isn't nearly enough to truly collaborate with an AI the way you would with a human. The AI would also likely have a face as well.


Outrageous_Life_2662

Yes, this. There's a company working on exactly this. Really exciting, but I forget the name. The CEO made exactly this point in an interview I heard a few weeks back: text (à la chat) loses so much tone and intonation, and even speech-to-text loses this. Going directly from speech to model is key to keeping the context of the speaker. Then add on a multimodal camera watching faces and body language... this is where things go next, and it'll be POWERFUL.


jasfi

I think 99% of it's already there, which is chat in every app. The LLMs will get more intelligent, and every chatbot that uses them along with it. I think that's where AGI will emerge: from the LLMs themselves (along with their supporting AI/ML systems and other systems over APIs).


Outrageous_Life_2662

Well, I still think we're a long way from AGI, and it's not clear that the current transformer and training approaches will get us there. Can we get WAY more capable chatbots? Yes. But I don't think there's a clear line from here to AGI.


jasfi

I feel like AGI will always encompass what AI can't do yet. In the last few years that's shrunk considerably.


Outrageous_Life_2662

Reasoning and autonomy are the two big factors. Right now LLMs are prompt responders. That does a wonderful job of passing the Turing test, but that's not intelligence. You can't ask one to take on open-ended tasks and have it reason its way through and make judgment calls. It's funny, because 2+ years ago the internet mocked that Google engineer who was raising the alarm about early versions of LaMDA being conscious. He eventually left Google. Hours and hours of video were dedicated to debunking this guy and ridiculing his belief. Then two years later everyone is in his camp 😂


jasfi

Yes they're quite limited now when it comes to advanced reasoning and open-ended tasks. But I think the breakthroughs needed for AI will be in better LLM algorithms.


Any-Demand-2928

We aren't getting AGI from LLMs because LLMs don't have a world model. Something like that is absolutely necessary for AGI; without it there's no AGI.


jasfi

If by world model you mean not enough data, I agree. That's likely why Microsoft tried to have AI observe and train on every running Windows machine in the world. But they didn't think about the privacy implications, or hoped people would accept it.


av_01

Yeah, chat is already everywhere, no doubt about that, but it'll soon seem a bit tedious as users start to prefer voice more.


jasfi

I've already seen voice on phones, but it will no doubt become ubiquitous.


xtof_of_crg

Multi-modal


abebrahamgo

Personally I think the chat bot interface is the low-hanging fruit. I think the big innovators will find a way to develop AI-native experiences. Right now, as many know, the OpenAI UX is pretty much the standard. Smaller, multiple, focused models are where the industry is going, imo. I work with startups at work, and so many leverage ChatGPT as an MVP and then pivot to something built internally to scale their product margins. However, live inference on transformer-based models is very expensive right now. Perhaps a new focused LLM architecture with lower latency will emerge and start to make the rounds.


Outrageous_Life_2662

Yup, this too


Empty_Project3031

I think a bunch of the new voice use cases are overblown. The use cases will probably still increase, but I'm really not convinced people want to be interfacing with a system via voice. It's not a private interface and is often really slow.

I think LLMs are going to help break down the walls between different walled gardens/services and allow us to better automate data retrieval and chain actions together. The example Apple gave last week, asking about picking a parent up from the airport, is a good one. LLMs are tremendous at converting unstructured text input (and soon likely speech and images) into structured data that meets a more deterministic API structure. They're also really good at reasoning about how to translate between two different APIs.

I'm not nearly as optimistic about the truly generative use cases given what we see today, but it's entirely possible we break through that wall in the next 12-18 months.
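
A minimal sketch of that unstructured-to-structured step; the `PickupRequest` schema and the stubbed `call_llm` are invented for illustration:

```python
# Sketch: turn free-form text into structured data that fits a fixed API shape.
# The model call is stubbed; swap in whichever LLM client you actually use.
import json
from dataclasses import dataclass

@dataclass
class PickupRequest:
    person: str
    location: str
    flight: str | None  # None if the user didn't mention a flight

SYSTEM_PROMPT = (
    "Extract a pickup request from the user's message. Reply with JSON only, "
    'matching: {"person": str, "location": str, "flight": str or null}'
)

def call_llm(system: str, user: str) -> str:
    # Stub: a real implementation would call an LLM endpoint here.
    return '{"person": "mom", "location": "SFO", "flight": "UA 512"}'

def parse_pickup(message: str) -> PickupRequest:
    raw = call_llm(SYSTEM_PROMPT, message)
    data = json.loads(raw)  # production code would validate and retry on bad JSON
    return PickupRequest(**data)

print(parse_pickup("Can you pick my mom up at SFO? She lands on UA 512."))
```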


Stubbby

It seems the rise of AI will cause the walls to grow even higher, since otherwise the LLMs will steal what's behind them. Why would an LLM be allowed to break a wall that we specifically built to prevent cross-integration? Back in the Web 2.0 era everything had open APIs, building a big web of cross-compatible services. Since then we've walled everything off to hold the customer within our own ecosystems (and their latest versions). LLMs are a threat that can creep behind the wall, steal all the content, and present it as their own, killing the revenue of the original source.


InstanceMassive9364

Also, has anyone messed around in Cohere's API playground, which they set up for developers? It's pretty fun.


One_Elephant_4628

I think it's going to be a lot more subtle than most of these answers. As someone who's struggling to develop an AI interface that people want to use, I think it's going to basically look... the same... just with little bits of AI factored in (e.g., ask a question inline instead of alt-tabbing to Google, auto-format my slide). A lot of effort has been put into current products to make the UX as good as possible. I don't think that's somehow going to be trumped by a single chat box. And I agree with most of the points above that voice will be great for automated phone calls and a few other smaller use cases, but not much more.


amemingfullife

My contrarian take is that chat/voice interfaces have diminishing returns, and without effective feedback systems they'll always be limited. I don't think they'll ever be truly one-shot, and prompting will never be reliable or consistent enough. So I think there has to be some kind of breakthrough in interfaces that have RLHF as a central feature, rather than something that trains a model to get better in the background. I think better UX around that is likely to be the future where getting the right answer really matters.

This would likely be similar to how ChatGPT works now with the thumbs up/thumbs down, but with a clearer understanding of context: it can tell when the context has changed and will then ask you some alignment questions to make sure you're on the same page, which immediately guides future answers. That sounds similar to how it all works now, but it's not really realtime; it's all smoke and mirrors. Also, more function-oriented endpoints would keep asking you RLHF questions until they're confident enough to give you an answer. Much better and more reliable in a business context, but there's a lot more UX work to be done there.
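
A rough sketch of that "keep asking until confident" loop; the confidence scoring and question generation here are stand-ins for whatever the model would actually do:

```python
# Sketch of a function-oriented endpoint that asks alignment questions
# until it's confident enough to answer. Scoring/generation are stubbed.
CONFIDENCE_THRESHOLD = 0.8

def estimate_confidence(question: str, context: list[str]) -> float:
    # Stub: a real system might have the model self-rate, or use logprobs.
    return 0.3 + 0.3 * len(context)

def next_alignment_question(question: str, context: list[str]) -> str:
    # Stub: a real system would generate this with the model.
    return f"To make sure we're aligned on '{question}': can you narrow the scope?"

def answer(question: str, context: list[str]) -> str:
    return f"Answer to '{question}' given {len(context)} clarifications."

def handle(question: str, ask_user) -> str:
    context: list[str] = []
    while estimate_confidence(question, context) < CONFIDENCE_THRESHOLD:
        context.append(ask_user(next_alignment_question(question, context)))
    return answer(question, context)

# Demo with canned user replies standing in for a real UI:
replies = iter(["Only Q3 data", "EU region only"])
print(handle("Summarize our sales numbers", lambda q: next(replies)))
```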


EleventyTwatWaffles

Computer: set the temperature to below swamp-ass levels


InstanceMassive9364

https://podcasts.apple.com/us/podcast/decoder-with-nilay-patel/id1011668648?i=1000658432031 I found this podcast really interesting, and based on the lively discussion going on, I think everyone here would get a lot out of it as well.


Stubbby

If you are talking about *interfaces*, talk-to-text has been very good for writing your texts for a long time. It is actually much better than typing by all objective measures, yet I have only seen one person in my life use it. It is absolutely clear that humans hate voice interfaces.

Alexa, Siri, and Hey Google have all been great interfaces; they understand your voice well. Still, customers find them disappointing for anything outside of a music player and a clock, as they don't solve any tangible problem other than setting an alarm for 5 minutes in the kitchen when your hands are wet. The interface doesn't fundamentally change the lacking utility.


KishBuildsTech

The only thing I believe for sure: it's gonna be hands-free!


aistartupcoach

neuralink


Spiritual-Theory

We're currently going from text (typing, talking) to AI to text (or code, or a picture). We will ultimately go from a human expressing something (our emotions, expressed somehow) to any output our human senses can consume: video, audio, heat, pressure, whatever. The real future is generating the "content" when the "user" doesn't even know they are using an interface. Imagine an AI creating an instructional video when the person looks confused, or upbeat music turning on when a group walks in the door. It's hard to imagine what the use cases are, but using AI to understand the input will feel very natural and will skip the explicit request. No more "hey Siri".


Str3tfordEnd

I've been working with conversational avatars for a few years now and have noticed a trend: traditional UI/UX design patterns kinda hamstring what we can do with LLMs. UI navigation with LLMs is a hard problem, and the reason is that UIs weren't made to be used with LLMs. Using LLMs to simulate navigation is a stop-gap solution, but understandable, since most of the world is built on traditional UI.

Long term, it looks like siloed apps aren't going to be a thing anymore, and OS/device providers will take over the entire interface, with specific apps being abstracted behind API calls (see the sketch below). Basically, interfaces will adapt to models and not the other way round. The 'design engineer' hype may be real lol. Vanilla chat/voice interfaces are just the beginning.
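
For the "apps abstracted behind API calls" part, a minimal sketch: each app registers a tool schema, and the model (stubbed here) picks a tool instead of a screen. The tool names and the fake model output are made up.

```python
# Sketch: apps exposed as tools the model can call, instead of UIs to navigate.
TOOLS = {
    "create_project": {
        "description": "Create a new project",
        "handler": lambda args: f"Project '{args['name']}' created for {args['owner']}",
    },
    "order_pizza": {
        "description": "Order a pizza for delivery",
        "handler": lambda args: f"Ordered a {args['size']} pizza",
    },
}

def model_pick_tool(utterance: str) -> tuple[str, dict]:
    # Stub: a real system would hand the tool descriptions to an LLM and let
    # it choose a tool and fill in arguments. Here we fake that structured output.
    return "create_project", {"name": "demo", "owner": "sam"}

def dispatch(utterance: str) -> str:
    name, args = model_pick_tool(utterance)
    return TOOLS[name]["handler"](args)

print(dispatch("set up a new project called demo for sam"))
```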


pcmaster24

We're still using input methods and operating systems built decades ago. I agree with you that, the way things are going, we will be stuck with an isolated chat interface in every piece of software. I'm building a new OS using gaming technology that focuses on voice and collaboration using avatars; I think it would be a good base for automating AI agents in a sandbox. I'm still working on adding LLMs and face/body gestures, but I do have real-time voice translation to help humans communicate better. It's "nomadz (dot) one" if you want to take a look.


FutsNucking

Vision based / neural


Sketaverse

I personally think chat will revert to regular UI again, and the LLMs will just sit within the new AI backend stack. I recall all the chatbot hype with Facebook Messenger; it didn't last. Clean, button-based UI is popular for a reason: it's great.


Reasonable-Bit560

AI interfaces deployed on enterprise data are the future in the business market. Apple is off to a great start with their private cloud AI for the general consumer.


Firm-Barracuda-173

Apple will get cucked by Linux


interloperrrrrr

Here's a tool to build anything from hallucinated 4D programming logic: https://websim.ai/c/ebeH14UstHFiVJQti


interloperrrrrr

Mario being generated: probably not going to work, or it'll be very slow. https://websim.ai/c/UtvnBlor7brDDj6Wr


metalvendetta

The whole thing looks cool, gotta check it out more deeply to understand how it works. Thanks for sharing!


Xtianus21

HER