https://discord.com/channels/436251713830125568/1355871422639112322
that's a bloody good hint, as the british would say
i forgot to paste the code
it's not just the code
what are you trying to do?
which problems are you finding?
any errors?
all that is part of the question
here my intention is to increment the strWordCount variable every time the " " key is pressed
that's not how word counting works
yes
im thinking to add && in the if condition
that still isn't how a word works
first, you have to decide on what is a word
what to do, rookie logic😔
what is a word?
yeah any letter
hello
<-- so, 5 words?
no
so, it isn't "any letter"
once the user press " ", its one word
then
h i
is how many words?
yeah thats the problem
im thinking to solve with the "&&" in if condition
you're counting spaces, not words
right
what do i do
don't count spaces
count words
but first, what is a word?
how do i count words
you define what is a word
yeah idk
they didnt mention it
there's the naïve "anything between spaces is a word", which you can do with
value.split(' ').length
but that's also not how words work
for example, -
would be a word
and no.nope
would be 1 word too
oh yeah
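A quick sketch of where the naive split falls short. The function name `naiveWordCount` is made up for illustration, it isn't from the project:

```javascript
// Naive count: everything between single spaces is treated as a "word".
function naiveWordCount(value) {
  return value.split(" ").length;
}

console.log(naiveWordCount("hello world")); // 2
console.log(naiveWordCount("no.nope"));     // 1
console.log(naiveWordCount("a - b"));       // 3, the "-" is counted as a word
console.log(naiveWordCount(""));            // 1, because "".split(" ") is [""]
console.log(naiveWordCount("h  i"));        // 3, the double space yields an empty entry
```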
now, you can decide to split by non-"word" characters, but
123456
would be 6 words
also, what about 👍 ?
that's an emoji - and it can have 1-2+ characters
i think this whole project is goofy
what about z̴͐͘a̷̋̽ľ̸̀g̸̍̀o̷͛͛?
is that trillions of words? just 1 with trillions of ligatures and accents and weirdness?
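To make the "1-2+ characters" point concrete, here is a small sketch of how JavaScript measures these strings. The variable names are only for illustration:

```javascript
const thumbsUp = "\u{1F44D}";               // 👍
const thumbsUpTone = "\u{1F44D}\u{1F3FD}";  // 👍🏽 (thumbs up + skin tone modifier)

console.log(thumbsUp.length);                 // 2 (UTF-16 code units)
console.log(Array.from(thumbsUp).length);     // 1 (code points)
console.log(thumbsUpTone.length);             // 4
console.log(Array.from(thumbsUpTone).length); // 2

// Combining marks, the building blocks of zalgo text, are separate code points:
const eWithAcute = "e\u0301"; // "e" followed by a combining acute accent
console.log(eWithAcute.length); // 2
```

So what looks like one character on screen can be several units as far as the string is concerned.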
you need a definition on what is a word
wow
they didnt give it
what do i do
well, then you decide what's "good enough"
or just use someone else's code 🤣
should a beginner do this project
yes
there seems much better projects
but the problem is badly defined, if it doesn't say what counts as a word
right
i also have to count sentences
and they didnt mention its definition either
😔
but first, you need to know what is a word, because a sentence is a set of 1 or more words
yeah how do i decide
depends on what you consider a word
how is - a word
or +
is
abso-fucking-lutely
a word? or 3 words?
no idea
then you have to decide
if i decide its 1 word then it will be problematic
natural language processing is the subject of many books
and basically it is what you need to do
but that's hard, so, you compromise and make something "good enough"
this is way too philosophical a question for a word counter
just split the value of the textarea by " ", take the length, and call it a day
i know, but you need to split the text into "tokens"
that is absolutely valid, but he also needs to count sentences
which is very easy
it's just split by a blob or any of
. ? !
together, like no!?
or the end of the text
i think i should count " - " as a word too
or ?
!
etc
i wouldn't count those as words
i would split by anything that isn't a letter or number
1 or more of those
oh yeah
for example
he, when (yesterday), cooked
the ),
has to be split without being counted as 3 words
or even as 1
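A minimal sketch of that idea, assuming "letter or number" means ASCII `A-Z`, `a-z`, `0-9` only. The name `countAlnumWords` is hypothetical:

```javascript
// Split on runs of one or more characters that are not ASCII letters or digits,
// then drop the empty strings left over at the edges.
function countAlnumWords(text) {
  return text.split(/[^A-Za-z0-9]+/).filter(Boolean).length;
}

console.log(countAlnumWords("he, when (yesterday), cooked")); // 4
console.log(countAlnumWords("-"));                            // 0
console.log(countAlnumWords(""));                             // 0
```

Because the separator is "1 or more" non-word characters, the `), ` run is treated as a single separator and never becomes a word.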
but remember: you're aiming for "ok enough"
yeah
if you're happy with just splitting by space and counting everything that isn't empty, that's fine too
as long as you define what is and isn't a word
yeah but it doesnt feel satisfying
that's because there's more nuance to it
but it's enough, most of the time
if i do that then " he , when ( yesterday ) , cooked " will be 8 words
wtf
😔
you're right
why is this project under junior category
you can always ignore the symbols
oh yeah
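A rough sketch of both options: splitting by whitespace and keeping every non-empty token, then ignoring tokens that are only symbols. The helper names are made up, and "symbol" here is assumed to mean "contains no ASCII letter or digit":

```javascript
// Split on whitespace and drop empty tokens.
function splitOnWhitespace(text) {
  return text.split(/\s+/).filter((token) => token !== "");
}

// Keep only tokens that contain at least one ASCII letter or digit.
function countIgnoringSymbols(text) {
  return splitOnWhitespace(text).filter((token) => /[A-Za-z0-9]/.test(token)).length;
}

const spaced = " he , when ( yesterday ) , cooked ";
console.log(splitOnWhitespace(spaced).length); // 8
console.log(countIgnoringSymbols(spaced));     // 4
```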
there's another good one: count word boundaries with a regex and divide by 2 (because each word has two word boundaries), with a little bit of extra logic for an empty text box showing 0 instead of 1:
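The snippet that was posted here isn't preserved, so this is only a sketch of the idea as described, not the original code. `countWordsByBoundary` is a hypothetical name:

```javascript
// Every run of word characters has one boundary at its start and one at its end,
// so the number of \b matches divided by 2 is the number of words.
function countWordsByBoundary(text) {
  // match() returns null when there are no boundaries (for example, empty text).
  const boundaries = text.match(/\b/g) || [];
  return boundaries.length / 2;
}

console.log(countWordsByBoundary("hello world"));                  // 2
console.log(countWordsByBoundary("he, when (yesterday), cooked")); // 4
console.log(countWordsByBoundary(""));                             // 0
```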
that's a pretty good way to do it
😵‍💫
because you're supposed to half-ass the counting. They effectively want you to count spaces, period/question mark/exclamation mark and call it a day
i wanna ignore symbols and stuff
they just want to get you out of the box, think for a little and implement something ok enough
what they're expecting of you is this:
the
\b
already does that
or maybe
sentenceCount = text.split(/\.|\?|\!/).length
ohhh
maybe
is it this simple
i would do
/[.?!]+/
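A small comparison of the two patterns on a made-up sample string, to show what the `+` buys you:

```javascript
const sample = "Really!? No way... Fine.";

// One split per punctuation character: "!?" and "..." leave empty pieces behind.
const perChar = sample.split(/\.|\?|\!/);
console.log(perChar.length); // 7

// One split per run of punctuation.
const perRun = sample.split(/[.?!]+/);
console.log(perRun.length);  // 4, the last piece is the empty string after the final "."

// Dropping blank pieces gives the count you'd expect.
const sentences = perRun.filter((part) => part.trim() !== "");
console.log(sentences.length); // 3
```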
because i dont remember other projects in junior section being so hard
the quick and dirty version is, which is what the "junior" category would imply
right
like, most of the time, word and sentence counting doesn't have to be perfect to be usable
just ok enough
if I'm supposed to write a 3000 word essay, no one is going to care if it's 3004 or 2997
if you want perfect, you're diving into the deep deep end
right
and there's a million edge cases
exactly
cause language is a fuck
and you need a dictionary as well
ok if the project was asking us to make it perfect, under what difficulty level would this project be in
impossible
oh my god
thank god
and im not even joking
i thought i was slow brain
cuz of this project
using a regex with
\b
to find word boundaries is the cleverer solution, because you're offloading figuring out what a word is to whoever wrote the regex engine, but even then \b just looks at the edge between word characters and non-word characters. Word characters are A-Z, a-z, 0-9, and _
so this: ` ` is three words according to regular expressions
oh
some writing systems put spaces in long numbers too, so 1 200 122 747 is four words
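To see what the regex engine treats as a word, a sketch like this lists the `\w+` runs directly. The `wordTokens` name and the sample inputs other than the long number are just for illustration:

```javascript
// Return every maximal run of word characters (A-Z, a-z, 0-9 and _).
function wordTokens(text) {
  return text.match(/\w+/g) || [];
}

console.log(wordTokens("1 200 122 747")); // ["1", "200", "122", "747"]
console.log(wordTokens("snake_case"));    // ["snake_case"], the underscore is a word character
console.log(wordTokens("don't"));         // ["don", "t"], the apostrophe is not
```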
also, it miserably fails at anything that isn't english
the tags of this project should be " html css js english "
phone numbers in the US:
four words
but it is bloody good for what it is
acção
would be split into 2 words, with 4 word boundaries
(it means "action" in portuguese)
and if you use the "combination" characters, then it is a lot more
the ç
and the ã
would be 3 words each
but for the size, it's awesome
what would this do
replaces the first instance of
?
and !
for .
to then split by .
oh
im switching to other project bruh😭
it's buggy because
.replace
with a string only replaces once
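The code being discussed isn't shown here, so this is only a sketch of the behavior being described, on a made-up input:

```javascript
const input = "One? Two? Three!";

// A string pattern replaces only the first occurrence.
console.log(input.replace("?", "."));      // "One. Two? Three!"

// A regex with the g flag replaces every occurrence.
console.log(input.replace(/[?!]/g, "."));  // "One. Two. Three."

// replaceAll (ES2021) replaces every occurrence of a string pattern.
console.log(input.replaceAll("?", "."));   // "One. Two. Three!"
```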
splitting by /(?:\s*[.?!]\s*)+/
is good enough
that gives you sentences
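One way that pattern could be wrapped into a counter. This is a sketch, with `countSentences` as a hypothetical name and the assumption that trailing text without punctuation still counts as a sentence:

```javascript
// Runs of . ? ! with optional surrounding whitespace mark the end of a sentence.
const SENTENCE_END = /(?:\s*[.?!]\s*)+/;

function countSentences(text) {
  // Drop blank pieces, such as the empty string after a final "." or in empty text.
  return text.split(SENTENCE_END).filter((part) => part.trim() !== "").length;
}

console.log(countSentences("Hi there. How are you?! Fine")); // 3
console.log(countSentences("no!?"));                         // 1
console.log(countSentences(""));                             // 0
```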
you can use /(?:\s*(?:[.?!\W_])\s*)+/
to get all the words in the entire text
then the whole project is done
isnt it
basically
hahaha
but, again, this is using regular expressions
the """"""""""expert's"""""""""" way
regex101
it's good enough
try it
alright
it's matching by what is NOT a word
what's left is what is a word
i can optimize it a fair bit more
[.?!\W_\s]+
if you want what is a word, you can do [^.?!\W_\s]+
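A sketch of both directions, matching what is a word and splitting by what isn't. The names `WORD` and `countWords` are made up for illustration:

```javascript
// Matches runs of characters that are not punctuation, not \W, not _ and not
// whitespace. In practice that leaves ASCII letters and digits.
const WORD = /[^.?!\W_\s]+/g;

function countWords(text) {
  return (text.match(WORD) || []).length;
}

console.log(countWords("he, when (yesterday), cooked")); // 4
console.log(countWords("snake_case"));                   // 2, "_" is treated as a separator
console.log(countWords(""));                             // 0

// Splitting by the inverse class gives the same words, plus empty strings to filter out.
const pieces = "he, when (yesterday), cooked".split(/[.?!\W_\s]+/).filter(Boolean);
console.log(pieces); // ["he", "when", "yesterday", "cooked"]
```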
this is for words
Really? I'd assume accented letters would work properly
they don't, because
ç
and ã
aren't a-zA-Z0-9
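A short sketch of that behavior, plus one possible workaround using Unicode property escapes (the `u` flag). The workaround is a suggestion of mine, not something the project asks for:

```javascript
const word = "ac\u00E7\u00E3o"; // "acção" with precomposed ç and ã

// Without the u flag, \w only covers A-Z, a-z, 0-9 and _.
console.log(/\w/.test("\u00E7")); // false
console.log(word.match(/\w+/g));  // ["ac", "o"]

// \p{L} (any letter) and \p{N} (any number) need the u flag.
// normalize("NFC") folds combining marks into precomposed letters where possible.
function unicodeWords(text) {
  return text.normalize("NFC").match(/[\p{L}\p{N}]+/gu) || [];
}

console.log(unicodeWords(word).length);                 // 1
console.log(unicodeWords("acc\u0327a\u0303o").length);  // 1, combining form of the same word
```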
Huh, til
I thought it'd work like sorting
i was surprised by it too