![](https://lemmy.world/pictrs/image/8f2046ae-5d2e-495f-b467-f7b14ccb4152.png)
10 days without food hits differently when you are hiking through mountains 16 hours a day vs sitting on your couch
Literally every library with any traction in any field is MIT licensed.
If the scientific Python stack were GPL, industry would have just kept paying for MATLAB licenses
For every 1 person who knows how to use the Windows command line, there are 50 people struggling because they didn’t embed their video into their PowerPoint, or worse, their USB stick only contains a shortcut to their actual .ppt file
Especially because a 15% tip is worth almost twice as much as it was 10 years ago - the tip is a percentage of the bill, and food costs have risen that much
The feature is explicit sync, a brand new graphics stack API that would fix some issues with NVIDIA rendering under Wayland.
It’s not a big deal. Canonical basically said ‘this isn’t a bug fix or security patch, so it’s not getting backported into our LTS release’ - if you want it, you have to install GNOME/mutter from source, switch operating systems, or just wait a few months for the next Ubuntu release
GNOME said this update is a minor bug fix (point release)
Canonical said this is actually a major feature update, and doesn’t want to backport it into its LTS repositories
Reddit has way more data than you would have been exposed to via the API though - they can look at things like the user’s ASN (is the traffic coming from a datacenter?), whether they were using a VPN, and they track things like scroll position, cursor movements, read time before posting a comment, how long it takes to type that comment, etc.
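As a purely hypothetical illustration of how signals like those could feed a bot score (the field names, weights, and thresholds below are invented for the example, not anything Reddit has published):

```python
# Hypothetical signal weighting - illustrative only, not Reddit's actual model.
def bot_likelihood(signals: dict) -> float:
    score = 0.0
    if signals.get("asn_is_datacenter"):         # traffic originates from a hosting ASN
        score += 0.35
    if signals.get("vpn_detected"):
        score += 0.15
    if signals.get("scroll_events", 0) == 0:     # never scrolled before replying
        score += 0.20
    if signals.get("read_time_s", 0.0) < 2.0:    # replied almost instantly
        score += 0.15
    if signals.get("typing_time_s", 0.0) < 1.0:  # comment pasted rather than typed
        score += 0.15
    return min(score, 1.0)
```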
No one at Reddit is going to hunt these sophisticated bots, because they inflate the numbers
You are conflating “don’t care about bots” with “don’t care about showing bot-generated content to users”. If the latter increases activity and engagement, there is no reason to put a stop to it. However, when it comes to building predictive models, A/B testing, and other internal decisions, they have a vested financial interest in making sure they are focusing on organic users - how humans interact with humans and/or bots is meaningful data; how bots interact with other bots is not
Not with 64 GB of RAM and 16+ cores on that budget
To compare every comment on Reddit to every other comment in Reddit’s entire history would require an index - a naive pairwise scan would be O(n²)
You think in Reddit’s 20 year history no one has thought of indexing comments for data science workloads? A cursory glance at their engineering blog indicates they perform much more computationally demanding tasks on comment data already for purposes of content filtering
you need to duplicate all of that data in a separate database and keep it in sync with your main database without affecting performance too much
Analytics workflows are never run on the production database, always on read replicas, which are kept in sync asynchronously from the transaction logs so as not to affect production database read/write performance
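A minimal sketch of that split, assuming PostgreSQL-style streaming replication and hypothetical connection strings for the primary and the replica:

```python
import psycopg2

# Hypothetical DSNs: the replica is fed asynchronously from the
# primary's write-ahead log, so heavy analytics reads here never
# contend with production writes.
PRIMARY_DSN = "host=db-primary dbname=app user=app"
REPLICA_DSN = "host=db-replica dbname=app user=analytics"

def run_analytics_query(sql, params=None):
    # Read-only analytics work goes to the replica, never the primary.
    with psycopg2.connect(REPLICA_DSN) as conn, conn.cursor() as cur:
        cur.execute(sql, params)
        return cur.fetchall()

def write_user_action(sql, params=None):
    # Normal production reads/writes stay on the primary.
    with psycopg2.connect(PRIMARY_DSN) as conn, conn.cursor() as cur:
        cur.execute(sql, params)
```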
Programmers just do what they’re told. If the managers don’t care about something, the programmers won’t work on it.
Reddit’s entire monetization strategy is collecting user data and selling it to advertisers - It’s incredibly naive to think that they don’t have a vested interest in identifying organic engagement
Look at the picture above - this is trivially easy. We are talking about identifying repost bots, not seeing if users pass/fail the Turing test
If 99% of a user’s posts can be found elsewhere, word for word, with the same parent comment, you are looking at a repost bot
I know everyone here likes to circle jerk over “le Reddit so incompetent” but at the end of the day they are a (multi) billion dollar company, and it’s willfully ignorant to assume that there isn’t a single engineer at the company who knows how to measure string similarity between two comment trees (hint: `import difflib` in Python)
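A minimal sketch of that check using only the standard library, where `corpus_lookup` is a hypothetical helper returning older comments that share the same parent context (e.g. via an index keyed on a hash of the parent comment):

```python
import difflib

def is_near_duplicate(a: str, b: str, threshold: float = 0.95) -> bool:
    # SequenceMatcher returns a similarity ratio in [0, 1];
    # values near 1.0 mean the texts are word-for-word copies.
    return difflib.SequenceMatcher(None, a, b).ratio() >= threshold

def repost_fraction(user_comments, corpus_lookup):
    # Fraction of a user's comments that duplicate an older comment
    # posted under the same parent elsewhere on the site.
    reposts = sum(
        1 for c in user_comments
        if any(is_near_duplicate(c, prior) for prior in corpus_lookup(c))
    )
    return reposts / max(len(user_comments), 1)
```

An account where that fraction approaches 1.0 matches the “99% of posts found elsewhere, word for word, with the same parent comment” description above.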
If you have access to the entire Reddit comment corpus it’s trivial to see which users are only reposting carbon copies of content that appears elsewhere on the site
Reddit has access to its own data - they absolutely know which users are posting unique content and which users’ content is a 100% copy of data that exists elsewhere on their own platform
Reddit probably omits bot accounts when it sells its data to AI companies
The plaintiff(s) in a class action usually gets a pretty decent chunk - substantially more than the class members, because they are the ones doing all the work on the class’s behalf
The payout for class members depends on the number of people who sign up, which generally depends on the burden of proof. If you need to provide a receipt, the payout is generally much higher because it gets split fewer ways. I’ve gotten class action payouts as high as $300 when all I had to do was dig through my bank records to find the date of a transaction, and as low as $2 when all I had to do was click a link and enter my email address
They aren’t being made anymore - people are just reselling old hoarded stock
https://eyeondesign.aiga.org/we-spoke-with-the-last-person-standing-in-the-floppy-disk-business/
She has north of $3 million in sponsorship deals right now, and we can only assume that number will go up in the WNBA
You could set it up in Docker whilst still on Windows, and then all you need to do is copy/paste your compose file onto your new Linux machine. That way you aren’t struggling to learn two things at the same time (it alleviates the “I don’t know if the problem is with my Docker config or my host OS” issue)
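A minimal example of the kind of compose file that moves over unchanged, with a stand-in service in place of whatever you are actually self-hosting:

```yaml
# docker-compose.yml - works identically under Docker Desktop on
# Windows and under Docker Engine on Linux.
services:
  app:
    image: nginx:latest            # stand-in for your actual service
    ports:
      - "8080:80"
    volumes:
      - ./data:/usr/share/nginx/html
    restart: unless-stopped
```

Once it runs on Windows, `docker compose up -d` with the same file on the Linux box reproduces the setup.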
Fast food also varies so much across the US. I talked up Chick-fil-A so much after having it in Texas, and then when I brought my girlfriend to one in Florida it was garbage.
You can just point your domain at your local IP, e.g. 192.168.0.100
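For example, as an A record in your DNS provider’s zone (myapp.example.com is a hypothetical name; the address is the one above):

```
myapp.example.com.  300  IN  A  192.168.0.100
```

Anyone outside your LAN who resolves that name just gets an unroutable private address, so the service stays reachable only from inside your network.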