@Akisamb

Akisamb@programming.dev · 1 month ago

Now instead of just querying the goddamn database, a one line fucking SQL statement, I have to deal with the user team

Exactly, you understand very well the purpose of microservices. You can submit a patch if you need that feature now.

Funnily enough I’m the technical lead of the team that handles the user service in an insurance company.

Due to direct access to our data without consulting us, we’re getting legal issues as people were using addresses to guess where people lived instead of using our endpoints.

I guess some people really hate the validation that service layers have.

Akisamb@programming.dev · 3 months ago

I’m afraid that would not be sufficient.

These instructions are a small part of what makes a model answer like it does. Much more important is the training data. If you want to make a racist model, training it on racist text is sufficient.

AI companies put great care into the training data of these models to ensure that their biases are socially acceptable. If you train an LLM on the internet without care, a user will easily be able to prompt it into producing racist text.

Gab is forced to use this prompt because they’re unable to train a model, but as other comments show it’s a pretty weak way to force a bias.

The ideal solution for transparency would be public sharing of the training data.

Akisamb@programming.dev · 3 months ago

Hamas claims 6000 of their militants were killed.

https://www.wionews.com/world/hamas-official-says-over-6000-fighters-killed-during-war-in-gaza-691701/amp

Akisamb@programming.dev · 3 months ago

people who’ve never been laid

That was unnecessary. I know that people with poor social skills have more trouble with romance, but implying that all virgins are socially inept is a harmful stereotype. Luck is a big factor in finding relationships.

Akisamb@programming.dev · 3 months ago

It’s absolutely amazing, but it is also literally and technologically impossible for that to spontaneously coalesce into reason/logic/sentience.

This is not true. If you train these models on games of Othello, they’ll keep a state of the world internally and use that to predict the next move played (1). To execute addition and multiplication they are executing an algorithm on which they were not explicitly trained (although the GPT family is surprisingly bad at it, due to a badly designed tokenizer).

These models are still pretty bad at most reasoning tasks. But training on predicting the next word is a perfectly valid strategy. After all, the best way to predict what comes after the “=” in 1432 + 212 = is to do the addition.

Akisamb@programming.dev · 3 months ago

More than 33,000 Palestinians have been killed in Israel’s offensive, around two-thirds of them women and children, according to Gaza’s Health Ministry. Its count doesn’t distinguish between civilians and combatants.

The 33,000 figure includes Hamas combatants.

I’d say at least 20000 innocent civilians killed since the start of the conflict. Probably more as Israel seems to be quite trigger happy on civilians.

Akisamb@programming.dev · 3 months ago

Now let’s look at Office. Open an Excel spreadsheet with tables in any app other than excel. Tables are something that’s just a given in excel, takes 10 seconds to setup, and you get automatic sorting and filtering, with near-zero effort. No, I’m not setting up a DB in an open-source competitor to Access. That’s just too much effort for simple sorting and filtering tasks, and isn’t realistically shareable with other people.

Am I missing something, or isn’t it exactly the same thing in LibreOffice?

Akisamb@programming.dev · 3 months ago

I don’t believe there are solutions as complete as Teams. For video and voice calls it’s among the best.

But it’s so bad for text! Why do I have to wait a second when I change channels? Why does it not support Markdown (the partial implementation it has is arguably worse than no implementation at all)? Why is the search so bad?

Akisamb@programming.dev · 3 months ago

This is not true in France. Politicians whose fraud has been proven are arrested and charged. In France we have Sarkozy, Cahuzac and Fillon, who were all charged with crimes.

They were president, minister and presidential candidate respectively. I’d be surprised if it was different in the USA. I’m seeing that Trump is also being charged, so the system seems to be working.

Akisamb@programming.dev · 4 months ago

Convolutional neural networks and plant-identifying apps came before ChatGPT. Beyond both relying on neural networks, they don’t have much in common.

Akisamb@programming.dev · 4 months ago

Don’t know why you are downvoted, it’s a good question.

As a matter of fact it almost happened for search engines in France. Newspapers argued that snippets were leading people not to visit their ad-infested sites, thus losing them revenue.

https://techcrunch.com/2020/04/09/frances-competition-watchdog-orders-google-to-pay-for-news-reuse/

Akisamb@programming.dev · 5 months ago

They gave them a birth control shot without properly informing them of what it was. Still scandalous, but not what you are saying.

Akisamb@programming.dev · 6 months ago

Yes to your question, but that’s not what I was saying.

Here is one of the most popular training datasets : https://pile.eleuther.ai/

If you look at the PDF describing the dataset, you’ll find that these documents are fairly short: the mean document length is less than 20 KB (20,000 characters) for most of its component sets.

You are asking for a model to retain a memory for the whole duration of a discussion, which can be very long. If I chat for one hour I’ll type approximately 8400 words, or around 42 KB. That is longer than most documents in the training set. If I chat for 20 hours, it’ll be longer than almost all the documents in the training set. The model needs to learn how to extract information from a long context, and it can’t do that well if the documents on which it trained are short.
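To make the arithmetic concrete, here is a rough Python sketch. The 5 bytes per word is an assumption (it is simply what 42 KB divided by 8400 words implies), and the 20 KB document size is the upper bound quoted above, not an exact figure.

```python
# Back-of-the-envelope check of the chat-length estimate.
WORDS_PER_HOUR = 8400        # figure from the comment (140 words per minute)
BYTES_PER_WORD = 5           # assumption: implied by 42 KB / 8400 words
TYPICAL_DOC_BYTES = 20_000   # the "less than 20 KB" mean length, used as a bound


def chat_size_bytes(hours: int) -> int:
    """Approximate size of the text typed during a chat of the given length."""
    return hours * WORDS_PER_HOUR * BYTES_PER_WORD


for hours in (1, 20):
    size = chat_size_bytes(hours)
    ratio = size / TYPICAL_DOC_BYTES
    print(f"{hours:>2} h -> {size / 1000:.0f} KB, {ratio:.1f}x a 20 KB document")
```

Under these assumptions a one-hour chat is about twice the size of a 20 KB document, and a 20-hour chat is about 42 times that size.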

You are also right that during training the text is cut off. A value I often see is 2k to 8k tokens. This is arbitrary; some models are trained with a cut-off of 200k tokens. You can use models on context lengths longer than what they were trained on (with some caveats), but performance falls off badly.
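As an illustration of what "cut off" means, here is a minimal sketch of one simple scheme: slicing a tokenized document into consecutive fixed-length windows. The function name `split_into_windows` and the 4096 limit are hypothetical, chosen for the example. Real training pipelines differ (packing several documents together, overlapping windows, and so on), so this is not how any particular model is trained.

```python
from typing import List


def split_into_windows(tokens: List[int], max_len: int) -> List[List[int]]:
    """Cut a token sequence into consecutive windows of at most max_len tokens."""
    if max_len <= 0:
        raise ValueError("max_len must be positive")
    return [tokens[i:i + max_len] for i in range(0, len(tokens), max_len)]


document = list(range(10_000))   # stand-in for a 10,000-token document
windows = split_into_windows(document, max_len=4096)
print([len(w) for w in windows])  # lengths of the resulting training examples
```

With this scheme the model never sees the start and the end of the 10,000-token document in the same example, so it gets no practice linking information across that distance.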

Akisamb@programming.dev · 6 months ago

There are two issues with large prompts. One is linked to the current language technology, where the computation time and memory usage scale badly with prompt size. This is being solved by projects such as RWKV or Mamba, but these remain unproven at large sizes (more than 100 billion parameters). Somebody will have to spend some millions to train one.

The other issue will probably be harder to solve. There is less high-quality long-context training data. Most datasets were created for small-context models.
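A rough sketch of why the scaling is bad, assuming "current language technology" means standard transformer self-attention, where every token is compared with every other token. The helper name `attention_score_entries` is made up for this example.

```python
def attention_score_entries(num_tokens: int) -> int:
    """Number of pairwise scores in one full self-attention matrix."""
    return num_tokens * num_tokens


for n in (2_000, 8_000, 200_000):
    entries = attention_score_entries(n)
    print(f"{n:>7} tokens -> {entries:,} pairwise scores per head per layer")
```

Going from 2k to 8k tokens multiplies the number of comparisons by 16, not by 4. Optimised kernels can avoid storing the whole matrix at once, but the number of comparisons still grows with the square of the prompt length.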

Akisamb@programming.dev · 6 months ago

To avoid people being homeless ?

Akisamb@programming.dev · 7 months ago

I think it’s healthy to have clear boundaries with coworkers; they are not the same thing as friends.

That said, I spend 41 hours a week working, so there is no way I’m not going to socialise with my coworkers. If I don’t make any friends after several years of working at a place, I feel I have done something wrong.

Akisamb@programming.dev · edit-2 7 months ago

As long as the demographic chart of Palestinians murdered by the IDF looks like the actual Palestinian population demographic (1/3 women, 1/3 kids) it’s safe to assume that there is absolutely no real targeting taking place.

Yes, there is a bump if you look at the Hamas fighting population demographics, but it is a minority. The large majority of people killed in this war are civilians, there is no doubt about that. I was denying the 1:100 figure. For example Hamas has 1/3 female victims, yet has a 1:4 casualty rate.

Netanyahu literally said publicly that he wants to kill all Palestinians including the women and children and his deeds match his words.

No he didn’t and you know it. Why lie ?

Some senior Hamas executives have used such rhetoric about Jews before being very softly reprimanded by Hamas, but no executive from the Israeli government has. There have been plenty of dog whistles, but they are not stupid enough to say it literally.

Edit : I didn’t realize it but you were the person calling for the massacre of civilians in an earlier comment. Explains why you would lie, you need to dehumanise your enemy. I’m not spending more energy on this. You’re too far gone.

Akisamb@programming.dev · 7 months ago

Frankly both are awful and both should not be allowed to take control of Israel/Palestine. I have no idea what solution there is to this conflict honestly, I just want things to stay somewhat factual.

I agree that the 5000 figure seems highly improbable. Israel has been quite effective at killing high-ranking members of Hamas, but I doubt they have killed 5000 out of the 30000 militants.

Akisamb@programming.dev · 7 months ago

Where did I say that one side didn’t want to genocide the other? Hamas is more public about it and won’t even try to justify their civilian killings, but the Netanyahu government has made it clear again and again that they are willing to do collective punishment. The high civilian death rate is of course intentional.

Hamas has also killed plenty of civilians, and they don’t even try to pretend that it was accidental. That said you are close to their ratio which is three civilians for every military death.

Israel’s civilian deaths to militant deaths is probably higher due to the usage of bombs (10 civilian deaths per explosion) and intentional starvation but it isn’t 100:1.

Hamas’ strategy of hiding behind civilians is also a war crime since it obviously increases the number of civilians killed.

If you believe Israeli propaganda, they have killed 5000 Hamas militants. The real number is probably lower than that, but since Hamas intentionally doesn’t publish their militant casualties we won’t have a good estimate. That said, 500 Israeli soldiers have died and, seeing the asymmetry in warfare, you can expect many more Hamas militants to have died. I have not been able to find an estimate from an independent source.

Akisamb@programming.dev · 7 months ago

Yes ? Do you really think only 200 Hamas militants were killed ? Because that’s what your ratio would suggest.

Israel is unnecessarily killing and starving civilians, but once again gross misinformation serves nobody and only justifies more horrors.