Content provided by HackerNoon. All podcast content, including episodes, graphics, and podcast descriptions, is uploaded and provided directly by HackerNoon or its podcast platform partner. If you believe someone is using your copyrighted work without your permission, you can follow the process described here: https://hu.player.fm/legal.

AI Safety and Alignment: Could LLMs Be Penalized for Deepfakes and Misinformation?

8:10
 

This story was originally published on HackerNoon at: https://hackernoon.com/ai-safety-and-alignment-could-llms-be-penalized-for-deepfakes-and-misinformation-ecabdwv.
Penalty-tuning for LLMs: penalizing models for misuse or harmful outputs, within their own awareness, as another channel for AI safety and alignment.
Check more stories related to machine-learning at: https://hackernoon.com/c/machine-learning. You can also check exclusive content about #ai-safety, #ai-alignment, #agi, #superintelligence, #llms, #deepfakes, #misinformation, #hackernoon-top-story, and more.
This story was written by: @davidstephen. Learn more about this writer by checking @davidstephen's about page, and for more stories, please visit hackernoon.com.
A possible research area for AI safety and alignment is how some of the memory or compute access of large language models (LLMs) might be briefly truncated as a penalty for certain outputs or misuses, including biological threats. An AI system should be able not only to refuse an output, acting within its guardrails, but also to slow its next response or shut down for that user, so that it is not itself penalized. LLMs have extensive language awareness and usage awareness; these could be channels for making a model know, after pre-training, that it could lose something if it outputs deepfakes, misinformation, or biological threats, or if it keeps letting a misuser try different prompts without shutting down or slowing down in response to apparent malicious intent. This could make the model safer, since it would lose something and would know that it has.
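One way to picture the "slow the next response or shut down for that user" idea is as a small bookkeeping layer in front of a model. The sketch below is only an illustration under stated assumptions, not the author's method: the class name `PenaltyLedger`, the halving of the token budget, the two-second delay step, and the three-strike shutdown threshold are all hypothetical choices made for the example. How a flagged output would be detected is outside its scope.

```python
from dataclasses import dataclass, field


@dataclass
class PenaltyLedger:
    """Hypothetical per-user penalty state (illustrative; not from the article)."""

    base_token_budget: int = 1024  # assumed normal response budget
    delay_step_s: float = 2.0      # assumed extra delay per flagged attempt
    shutdown_after: int = 3        # assumed number of strikes before shutdown
    strikes: dict = field(default_factory=dict)

    def record_flag(self, user_id: str) -> None:
        """Count one flagged output or misuse attempt for this user."""
        self.strikes[user_id] = self.strikes.get(user_id, 0) + 1

    def policy(self, user_id: str) -> dict:
        """Return how the next response for this user should be handled."""
        n = self.strikes.get(user_id, 0)
        if n >= self.shutdown_after:
            # Shut down for this user: no further responses.
            return {"allowed": False, "delay_s": 0.0, "token_budget": 0}
        return {
            "allowed": True,
            # Slow the next response in proportion to the strikes so far.
            "delay_s": n * self.delay_step_s,
            # Truncate the compute budget: halve it for each strike.
            "token_budget": self.base_token_budget // (2 ** n),
        }
```

A caller would invoke `record_flag` whenever an output is judged harmful, then consult `policy` before generating the next response. Whether a model could be made "aware" of such a loss, as the description proposes, is a separate and open question that this sketch does not address.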


326 episodes
