Join GitHub today
GitHub is home to over 50 million developers working together to host and review code, manage projects, and build software together.
Sign upWord Embeddings Class Added #2198
Conversation
TravisBuddy
commented
Jul 13, 2020
|
Hey @sprintyaf, TravisCI finished with status TravisBuddy Request Identifier: ceec83a0-c500-11ea-af19-3b271fec1f42 |
TravisBuddy
commented
Jul 13, 2020
|
Hey @sprintyaf, TravisCI finished with status TravisBuddy Request Identifier: 66eb3070-c501-11ea-af19-3b271fec1f42 |
TravisBuddy
commented
Jul 13, 2020
|
Hey @sprintyaf, TravisCI finished with status TravisBuddy Request Identifier: 187097e0-c502-11ea-af19-3b271fec1f42 |
TravisBuddy
commented
Jul 13, 2020
|
Hey @sprintyaf, TravisCI finished with status TravisBuddy Request Identifier: b7c021d0-c502-11ea-af19-3b271fec1f42 |
TravisBuddy
commented
Jul 13, 2020
|
Hey @sprintyaf, TravisCI finished with status TravisBuddy Request Identifier: e5e2b610-c505-11ea-af19-3b271fec1f42 |
|
Please read CONTRIBUTING.md. |
TravisBuddy
commented
Jul 13, 2020
|
Hey @sprintyaf, TravisCI finished with status TravisBuddy Request Identifier: c19c4b70-c507-11ea-af19-3b271fec1f42 |
|
You might run $ flake8 --ignore=E203,W503 --max-complexity=25 --max-line-length=88 --statistics --count .
|
|
Please avoid backslash line termination as discussed in PEP8 because a whitespace character to the right of the backslash can break the script on a change that is invisibile to the user. |
TravisBuddy
commented
Jul 13, 2020
Travis tests have failedHey @sprintyaf, TravisBuddy Request Identifier: c152e230-c50e-11ea-af19-3b271fec1f42 |
TravisBuddy
commented
Jul 13, 2020
Travis tests have failedHey @sprintyaf, TravisBuddy Request Identifier: d8000510-c510-11ea-af19-3b271fec1f42 |
TravisBuddy
commented
Jul 13, 2020
Travis tests have failedHey @sprintyaf, TravisBuddy Request Identifier: fc302bc0-c512-11ea-af19-3b271fec1f42 |
|
Hi @cclauss, |
|
assert 70 <= my_function() <= 100, "my_function() should return a result > 70" |
Agree, but my functions return words. It can return any word. You can't test them like this. |
|
|
|
Please add tests to the deterministic functions and place a comment explaining why a doctest is impossible on the non deterministic functions. |
| @@ -33,8 +37,7 @@ def analogy(self, x1: str, x2: str, y1: str) -> str: | |||
| x1, x2, y1 = x1.lower(), x2.lower(), y1.lower() | |||
| error_msg = 'every word must be in the vocabulary' | |||
| assert all(self.is_in_vocab(x) for x in (x1, x2, y1)), error_msg | |||
| result = self._wv.most_similar(positive=[y1, x2], negative=[x1])[0][0] | |||
| return result | |||
| return self._wv.most_similar(positive=[y1, x2], negative=[x1])[0][0] | |||
|
|
|||
| def n_similarity(self, list1: list, list2: list) -> float: | |||
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
| @@ -54,8 +56,7 @@ def similarity(self, w1: str, w2: str) -> float: | |||
| w1, w2 = w1.lower(), w2.lower() | |||
| error_msg = 'both words must be in the vocabulary' | |||
| assert self.is_in_vocab(w1) and self.is_in_vocab(w2), error_msg | |||
This comment has been minimized.
This comment has been minimized.
ruppysuppy
Jul 15, 2020
Contributor
If the assertions are for tests, use the doctest format (see here)
This comment has been minimized.
This comment has been minimized.
cclauss
Jul 15, 2020
Member
These assertions are for runtime checks... If the user gives me garbage then I should complain. These garbage values should also be tested in the doctests.
TravisBuddy
commented
Jul 16, 2020
Travis tests have failedHey @sprintyaf, TravisBuddy Request Identifier: 8bfb7cf0-c77f-11ea-999c-bf5bae978323 |
|
@cclauss @ruppysuppy |
| parsed = WordVectors._parse_document(text) | ||
| documents.append(parsed) | ||
| except IOError: | ||
| continue |
This comment has been minimized.
This comment has been minimized.
cclauss
Aug 14, 2020
Member
Please put lines 166 to 173 in a separate function and then do
documents = [get_document_from_file(filename) for filename in glob.glob(full_path)]
| @@ -17,3 +17,5 @@ sklearn | |||
| sympy | |||
| tensorflow | |||
| xgboost | |||
This comment has been minimized.
This comment has been minimized.
cclauss
Aug 14, 2020
Member
Please insert requirements in alphabetical order to reduce the likelihood of duplicate entries.

sprintyaf commentedJul 13, 2020
•
edited
Describe your change:
#2187
I added Word Embeddings Class described in the above-mentioned issue.
Checklist:
Fixes: #{$ISSUE_NO}.