Resources for "Natural Language Processing" Coursera course.

Overview

Natural Language Processing course resources

This github contains practical assignments for Natural Language Processing course by Higher School of Economics: https://www.coursera.org/learn/language-processing. In this course you will learn how to solve common NLP problems using classical and deep learning approaches.

From a practical side, we expect your familiarity with Python, since we will use it for all assignments in the course. Two of the assignments will also involve TensorFlow. You will work with many other libraries, including NLTK, Scikit-learn, and Gensim. You have several options on how to set it up.

1. Running on Google Colab

Google has released its own flavour of Jupyter called Colab, which has free GPUs!

Here's how you can use it:

  1. Open https://colab.research.google.com, click Sign in in the upper right corner, use your Google credentials to sign in.
  2. Click GITHUB tab, paste https://github.com/hse-aml/natural-language-processing and press Enter
  3. Choose the notebook you want to open, e.g. week1/week1-MultilabelClassification.ipynb
  4. Click File -> Save a copy in Drive... to save your progress in Google Drive
  5. If you need a GPU, click Runtime -> Change runtime type and select GPU in Hardware accelerator box
  6. Execute the following code in the first cell that downloads dependencies (change for your week number):
! wget https://raw.githubusercontent.com/hse-aml/natural-language-processing/master/setup_google_colab.py -O setup_google_colab.py
import setup_google_colab
# please, uncomment the week you're working on
# setup_google_colab.setup_week1()  
# setup_google_colab.setup_week2()
# setup_google_colab.setup_week3()
# setup_google_colab.setup_week4()
# setup_google_colab.setup_project()
# setup_google_colab.setup_honor()
  1. If you run many notebooks on Colab, they can continue to eat up memory, you can kill them with ! pkill -9 python3 and check with ! nvidia-smi that GPU memory is freed.

Known issues:

  • No support for ipywidgets, so we cannot use fancy tqdm progress bars. For now, we use a simplified version of a progress bar suitable for Colab.
  • Blinking animation with IPython.display.clear_output(). It's usable, but still looking for a workaround.
  • If you see an error "No module named 'common'", make sure you've uncommented the assignment-specific line in step 6, restart your kernel and execute all cells again

2. Running locally

Two options here:

  1. Use the Docker container of our course. It already has all libraries, that you will need. The setup for you is very simple: install Docker application depending on your OS, download our container image, run everything within the container. Please, see this detailed Docker tutorial.

  2. Manually install all the libraries depending on your OS (each task contains a list of needed libraries in the very beginning). If you use Windows/MacOS you might find useful Anaconda distribution which allows to install easily most of the needed libraries. However, some tools, like StarSpace for week 2, are not compatible with Windows, so it's likely that you will have to use Docker anyways, if you go for these tasks.

It might take a significant amount of time and resources to run the assignments code, but we expect that an average laptop is enough to accomplish the tasks. All assignments were tested in the Docker on Mac with 8GB RAM. If you have memory errors, that could be caused by not tested configurations or inefficient code. Consider reporting these cases or double-checking your code.

If you want to run the code of the course on the AWS machine, we've prepared the AWS tutorial here.

Comments
  • Module common not found

    Module common not found

    Hi,

    module common not found. Tried on colab and on the docker as well.

    here is the traceback:

    `ImportError Traceback (most recent call last) in () 1 import sys 2 sys.path.append("..") ----> 3 from common.download_utils import download_week1_resources 4 5 download_week1_resources()

    ImportError: No module named 'common'`

    opened by fahadshery 9
  • Final submission error

    Final submission error

    I am using google colab for week 3 assignment and at the last when i am finally submitting it, the compiler is not recognizing my E-mail id. Please tell me the solution. ----> 1 STUDENT_EMAIL = [email protected]# EMAIL 2 STUDENT_TOKEN = AT5ZyzLxuQfnEhNg# TOKEN 3 grader.status()

    NameError: name 'mandloi19faraday96' is not defined

    This is the error i am getting how should i submit my assignment please help.

    opened by ayushmandloi2 3
  • Problem opening 'data/text_prepare_tests.tsv' file

    Problem opening 'data/text_prepare_tests.tsv' file

    Using the docker container environment I am getting a UnicodeDecodeError. More speciffically:

    prepared_questions = []
    for line in open('data/text_prepare_tests.tsv'):
         line = text_prepare(line.strip())
         prepared_questions.append(line)
    text_prepare_results = '\n'.join(prepared_questions)
    grader.submit_tag('TextPrepare', text_prepare_results)
    

    Is giving the following error: UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position 79: ordinal not in range(128)

    In order to run it I had to change it to:

    prepared_questions = []
    for line in open('data/text_prepare_tests.tsv', encoding='utf-8'):
         line = text_prepare(line.strip())
         prepared_questions.append(line)
    text_prepare_results = '\n'.join(prepared_questions)
    grader.submit_tag('TextPrepare', text_prepare_results)
    

    Can also be solved by using pd.read_csv.

    Is this error reproducible to anyone else?

    opened by mpizosdim 3
  • Week 1: Invalid Syntax

    Week 1: Invalid Syntax

    Hello, This code returns an "Invalid Syntax" error.

    `REPLACE_BY_SPACE_RE = re.compile('[/(){}[]|@,;]') BAD_SYMBOLS_RE = re.compile('[^0-9a-z #+_]') STOPWORDS = set(stopwords.words('english'))

    def text_prepare(text): """ text: a string

        return: modified initial string
    """
    text = # lowercase text
    text = # replace REPLACE_BY_SPACE_RE symbols by space in text
    text = # delete symbols which are in BAD_SYMBOLS_RE from text
    text = # delete stopwords from text
    return text`
    
    opened by ashraf21c 2
  • Bump tensorflow from 1.15.0 to 2.3.1 in /docker

    Bump tensorflow from 1.15.0 to 2.3.1 in /docker

    Bumps tensorflow from 1.15.0 to 2.3.1.

    Release notes

    Sourced from tensorflow's releases.

    TensorFlow 2.3.1

    Release 2.3.1

    Bug Fixes and Other Changes

    TensorFlow 2.3.0

    Release 2.3.0

    Major Features and Improvements

    • tf.data adds two new mechanisms to solve input pipeline bottlenecks and save resources:

    In addition checkout the detailed guide for analyzing input pipeline performance with TF Profiler.

    • tf.distribute.TPUStrategy is now a stable API and no longer considered experimental for TensorFlow. (earlier tf.distribute.experimental.TPUStrategy).

    • TF Profiler introduces two new tools: a memory profiler to visualize your model’s memory usage over time and a python tracer which allows you to trace python function calls in your model. Usability improvements include better diagnostic messages and profile options to customize the host and device trace verbosity level.

    • Introduces experimental support for Keras Preprocessing Layers API (tf.keras.layers.experimental.preprocessing.*) to handle data preprocessing operations, with support for composite tensor inputs. Please see below for additional details on these layers.

    • TFLite now properly supports dynamic shapes during conversion and inference. We’ve also added opt-in support on Android and iOS for XNNPACK, a highly optimized set of CPU kernels, as well as opt-in support for executing quantized models on the GPU.

    • Libtensorflow packages are available in GCS starting this release. We have also started to release a nightly version of these packages.

    • The experimental Python API tf.debugging.experimental.enable_dump_debug_info() now allows you to instrument a TensorFlow program and dump debugging information to a directory on the file system. The directory can be read and visualized by a new interactive dashboard in TensorBoard 2.3 called Debugger V2, which reveals the details of the TensorFlow program including graph structures, history of op executions at the Python (eager) and intra-graph levels, the runtime dtype, shape, and numerical composistion of tensors, as well as their code locations.

    Breaking Changes

    • Increases the minimum bazel version required to build TF to 3.1.0.
    • tf.data
      • Makes the following (breaking) changes to the tf.data.
      • C++ API: - IteratorBase::RestoreInternal, IteratorBase::SaveInternal, and DatasetBase::CheckExternalState become pure-virtual and subclasses are now expected to provide an implementation.
      • The deprecated DatasetBase::IsStateful method is removed in favor of DatasetBase::CheckExternalState.
      • Deprecated overrides of DatasetBase::MakeIterator and MakeIteratorFromInputElement are removed.

    ... (truncated)

    Changelog

    Sourced from tensorflow's changelog.

    Release 2.3.1

    Bug Fixes and Other Changes

    Release 2.2.1

    ... (truncated)

    Commits
    • fcc4b96 Merge pull request #43446 from tensorflow-jenkins/version-numbers-2.3.1-16251
    • 4cf2230 Update version numbers to 2.3.1
    • eee8224 Merge pull request #43441 from tensorflow-jenkins/relnotes-2.3.1-24672
    • 0d41b1d Update RELEASE.md
    • d99bd63 Insert release notes place-fill
    • d71d3ce Merge pull request #43414 from tensorflow/mihaimaruseac-patch-1-1
    • 9c91596 Fix missing import
    • f9f12f6 Merge pull request #43391 from tensorflow/mihaimaruseac-patch-4
    • 3ed271b Solve leftover from merge conflict
    • 9cf3773 Merge pull request #43358 from tensorflow/mm-patch-r2.3
    • Additional commits viewable in compare view

    Dependabot compatibility score

    Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


    Dependabot commands and options

    You can trigger Dependabot actions by commenting on this PR:

    • @dependabot rebase will rebase this PR
    • @dependabot recreate will recreate this PR, overwriting any edits that have been made to it
    • @dependabot merge will merge this PR after your CI passes on it
    • @dependabot squash and merge will squash and merge this PR after your CI passes on it
    • @dependabot cancel merge will cancel a previously requested merge and block automerging
    • @dependabot reopen will reopen this PR if it is closed
    • @dependabot close will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
    • @dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
    • @dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
    • @dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)
    • @dependabot use these labels will set the current labels as the default for future PRs for this repo and language
    • @dependabot use these reviewers will set the current reviewers as the default for future PRs for this repo and language
    • @dependabot use these assignees will set the current assignees as the default for future PRs for this repo and language
    • @dependabot use this milestone will set the current milestone as the default for future PRs for this repo and language

    You can disable automated security fix PRs for this repo from the Security Alerts page.

    dependencies 
    opened by dependabot[bot] 2
  • Week 1 incorrect answers in test_my_bag_of_words

    Week 1 incorrect answers in test_my_bag_of_words

    In the function test_my_bag_of_words, answers is defined as a list of list while it should be just list

    Original: answers = [[1, 1, 0, 1]]

    Should be: answers = [1, 1, 0, 1] as it is a return of my_bag_of_words function which takes text as input and returns np array

    opened by MohannadEhabBarakat 2
  • main_bot struggles if you have non-ascii characters in your name

    main_bot struggles if you have non-ascii characters in your name

    If you name contains any funny characters in Telegram, the bot will crash

    Ready to talk!
    An update received.
    Traceback (most recent call last):
      File "main_bot.py", line 111, in <module>
        main()
      File "main_bot.py", line 103, in main
        print("Update content: {}".format(update))
    UnicodeEncodeError: 'ascii' codec can't encode character '\xf8' in position 153: ordinal not in range(128)
    

    Although adding some more computational complexity, adding the following function

    def cast_to_utf_8(old_dict):
        """
        Encodes the string content of a dict to utf-8
    
        Parameters
        ----------
        old_dict : dict
            The dict to encode
    
        Returns
        -------
        new_dict : dict
            The encoded dict
        """
    
        def walk(node):
            """
            Recursively traverses a node ande encodes all strings to utf-8
    
            Parameters
            ----------
            node : dict
                The node to traverse
    
            Returns
            -------
            node : dict
                The node where the strings are encoded to utf-8
            """
            for key, item in node.items():
                if type(item)==dict:
                    walk(item)
                elif type(item)==list:
                    for i, elem in enumerate(item):
                        if type(elem) == str:
                            node[key][i] = elem.encode('utf-8')
                elif type(item)==str:
                    node[key] = item.encode('utf-8')
            return node
    
        new_dict = walk(old_dict)
    
        return new_dict
    

    and calling it like this in main()

                        if is_unicode(text):
                            update = cast_to_utf_8(update)
                            print("Update content: {}".format(update))
                            bot.send_message(chat_id, bot.get_answer(update["message"]["text"]))
                        else:
                            bot.send_message(chat_id, "Hmm, you are sending some weird characters to me...")
    

    was a remedy for me

    opened by loeiten 2
  • Bump nbconvert from 5.3.1 to 6.3.0 in /docker

    Bump nbconvert from 5.3.1 to 6.3.0 in /docker

    Bumps nbconvert from 5.3.1 to 6.3.0.

    Release notes

    Sourced from nbconvert's releases.

    5.5.0 Documentation Release

    This tag is used to provide a working documentation build for RTD.

    Commits

    Dependabot compatibility score

    Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


    Dependabot commands and options

    You can trigger Dependabot actions by commenting on this PR:

    • @dependabot rebase will rebase this PR
    • @dependabot recreate will recreate this PR, overwriting any edits that have been made to it
    • @dependabot merge will merge this PR after your CI passes on it
    • @dependabot squash and merge will squash and merge this PR after your CI passes on it
    • @dependabot cancel merge will cancel a previously requested merge and block automerging
    • @dependabot reopen will reopen this PR if it is closed
    • @dependabot close will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
    • @dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
    • @dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
    • @dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)
    • @dependabot use these labels will set the current labels as the default for future PRs for this repo and language
    • @dependabot use these reviewers will set the current reviewers as the default for future PRs for this repo and language
    • @dependabot use these assignees will set the current assignees as the default for future PRs for this repo and language
    • @dependabot use this milestone will set the current milestone as the default for future PRs for this repo and language

    You can disable automated security fix PRs for this repo from the Security Alerts page.

    dependencies 
    opened by dependabot[bot] 1
  • Bump html5lib from 0.9999999 to 0.99999999 in /docker

    Bump html5lib from 0.9999999 to 0.99999999 in /docker

    Bumps html5lib from 0.9999999 to 0.99999999.

    Changelog

    Sourced from html5lib's changelog.

    Commits
    • ebf6225 0.99999999 release! Let's party!
    • a8ba43e Merge pull request #270 from gsnedders/rename_stuff
    • 8cb144b Update the docs after all the renaming and add CHANGES
    • 00977d6 Rename a bunch of serializer module variables to be underscore prefixed
    • 18a7102 Have only one set of allowed elements/attributes for the sanitizer
    • c4dd677 Move a whole bunch of private modules to be underscore prefixed
    • 8db5828 Rename treewalkers.lxmletree to .etree_lxml for consistency
    • 1a61c44 Rename treewalkers.genshistream to .genshi for consistency
    • 6c30d0b Move serializer.htmlserializer to serializer
    • 7bbde54 Rename filters._base to .base to reflect public status
    • Additional commits viewable in compare view

    Dependabot compatibility score

    Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


    Dependabot commands and options

    You can trigger Dependabot actions by commenting on this PR:

    • @dependabot rebase will rebase this PR
    • @dependabot recreate will recreate this PR, overwriting any edits that have been made to it
    • @dependabot merge will merge this PR after your CI passes on it
    • @dependabot squash and merge will squash and merge this PR after your CI passes on it
    • @dependabot cancel merge will cancel a previously requested merge and block automerging
    • @dependabot reopen will reopen this PR if it is closed
    • @dependabot close will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
    • @dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
    • @dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
    • @dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)
    • @dependabot use these labels will set the current labels as the default for future PRs for this repo and language
    • @dependabot use these reviewers will set the current reviewers as the default for future PRs for this repo and language
    • @dependabot use these assignees will set the current assignees as the default for future PRs for this repo and language
    • @dependabot use this milestone will set the current milestone as the default for future PRs for this repo and language

    You can disable automated security fix PRs for this repo from the Security Alerts page.

    dependencies 
    opened by dependabot[bot] 1
  • Bump tensorflow from 1.15.0 to 2.7.2 in /docker

    Bump tensorflow from 1.15.0 to 2.7.2 in /docker

    Bumps tensorflow from 1.15.0 to 2.7.2.

    Release notes

    Sourced from tensorflow's releases.

    TensorFlow 2.7.2

    Release 2.7.2

    This releases introduces several vulnerability fixes:

    TensorFlow 2.7.1

    Release 2.7.1

    This releases introduces several vulnerability fixes:

    • Fixes a floating point division by 0 when executing convolution operators (CVE-2022-21725)
    • Fixes a heap OOB read in shape inference for ReverseSequence (CVE-2022-21728)
    • Fixes a heap OOB access in Dequantize (CVE-2022-21726)
    • Fixes an integer overflow in shape inference for Dequantize (CVE-2022-21727)
    • Fixes a heap OOB access in FractionalAvgPoolGrad (CVE-2022-21730)
    • Fixes an overflow and divide by zero in UnravelIndex (CVE-2022-21729)
    • Fixes a type confusion in shape inference for ConcatV2 (CVE-2022-21731)
    • Fixes an OOM in ThreadPoolHandle (CVE-2022-21732)
    • Fixes an OOM due to integer overflow in StringNGrams (CVE-2022-21733)
    • Fixes more issues caused by incomplete validation in boosted trees code (CVE-2021-41208)
    • Fixes an integer overflows in most sparse component-wise ops (CVE-2022-23567)
    • Fixes an integer overflows in AddManySparseToTensorsMap (CVE-2022-23568)

    ... (truncated)

    Changelog

    Sourced from tensorflow's changelog.

    Release 2.7.2

    This releases introduces several vulnerability fixes:

    Release 2.6.4

    This releases introduces several vulnerability fixes:

    • Fixes a code injection in saved_model_cli (CVE-2022-29216)
    • Fixes a missing validation which causes TensorSummaryV2 to crash (CVE-2022-29193)
    • Fixes a missing validation which crashes QuantizeAndDequantizeV4Grad (CVE-2022-29192)
    • Fixes a missing validation which causes denial of service via DeleteSessionTensor (CVE-2022-29194)
    • Fixes a missing validation which causes denial of service via GetSessionTensor (CVE-2022-29191)
    • Fixes a missing validation which causes denial of service via StagePeek (CVE-2022-29195)
    • Fixes a missing validation which causes denial of service via UnsortedSegmentJoin (CVE-2022-29197)
    • Fixes a missing validation which causes denial of service via LoadAndRemapMatrix (CVE-2022-29199)
    • Fixes a missing validation which causes denial of service via SparseTensorToCSRSparseMatrix (CVE-2022-29198)
    • Fixes a missing validation which causes denial of service via LSTMBlockCell (CVE-2022-29200)
    • Fixes a missing validation which causes denial of service via Conv3DBackpropFilterV2 (CVE-2022-29196)
    • Fixes a CHECK failure in depthwise ops via overflows (CVE-2021-41197)
    • Fixes issues arising from undefined behavior stemming from users supplying invalid resource handles (CVE-2022-29207)
    • Fixes a segfault due to missing support for quantized types (CVE-2022-29205)
    • Fixes a missing validation which results in undefined behavior in SparseTensorDenseAdd (CVE-2022-29206)

    ... (truncated)

    Commits
    • dd7b8a3 Merge pull request #56034 from tensorflow-jenkins/relnotes-2.7.2-15779
    • 1e7d6ea Update RELEASE.md
    • 5085135 Merge pull request #56069 from tensorflow/mm-cp-52488e5072f6fe44411d70c6af09e...
    • adafb45 Merge pull request #56060 from yongtang:curl-7.83.1
    • 01cb1b8 Merge pull request #56038 from tensorflow-jenkins/version-numbers-2.7.2-4733
    • 8c90c2f Update version numbers to 2.7.2
    • 43f3cdc Update RELEASE.md
    • 98b0a48 Insert release notes place-fill
    • dfa5cf3 Merge pull request #56028 from tensorflow/disable-tests-on-r2.7
    • 501a65c Disable timing out tests
    • Additional commits viewable in compare view

    Dependabot compatibility score

    Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


    Dependabot commands and options

    You can trigger Dependabot actions by commenting on this PR:

    • @dependabot rebase will rebase this PR
    • @dependabot recreate will recreate this PR, overwriting any edits that have been made to it
    • @dependabot merge will merge this PR after your CI passes on it
    • @dependabot squash and merge will squash and merge this PR after your CI passes on it
    • @dependabot cancel merge will cancel a previously requested merge and block automerging
    • @dependabot reopen will reopen this PR if it is closed
    • @dependabot close will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
    • @dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
    • @dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
    • @dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)
    • @dependabot use these labels will set the current labels as the default for future PRs for this repo and language
    • @dependabot use these reviewers will set the current reviewers as the default for future PRs for this repo and language
    • @dependabot use these assignees will set the current assignees as the default for future PRs for this repo and language
    • @dependabot use this milestone will set the current milestone as the default for future PRs for this repo and language

    You can disable automated security fix PRs for this repo from the Security Alerts page.

    dependencies 
    opened by dependabot[bot] 1
  • Bump tensorflow from 1.15.0 to 2.6.4 in /docker

    Bump tensorflow from 1.15.0 to 2.6.4 in /docker

    Bumps tensorflow from 1.15.0 to 2.6.4.

    Release notes

    Sourced from tensorflow's releases.

    TensorFlow 2.6.4

    Release 2.6.4

    This releases introduces several vulnerability fixes:

    TensorFlow 2.6.3

    Release 2.6.3

    This releases introduces several vulnerability fixes:

    • Fixes a floating point division by 0 when executing convolution operators (CVE-2022-21725)
    • Fixes a heap OOB read in shape inference for ReverseSequence (CVE-2022-21728)
    • Fixes a heap OOB access in Dequantize (CVE-2022-21726)
    • Fixes an integer overflow in shape inference for Dequantize (CVE-2022-21727)
    • Fixes a heap OOB access in FractionalAvgPoolGrad (CVE-2022-21730)
    • Fixes an overflow and divide by zero in UnravelIndex (CVE-2022-21729)
    • Fixes a type confusion in shape inference for ConcatV2 (CVE-2022-21731)
    • Fixes an OOM in ThreadPoolHandle (CVE-2022-21732)
    • Fixes an OOM due to integer overflow in StringNGrams (CVE-2022-21733)
    • Fixes more issues caused by incomplete validation in boosted trees code (CVE-2021-41208)
    • Fixes an integer overflows in most sparse component-wise ops (CVE-2022-23567)
    • Fixes an integer overflows in AddManySparseToTensorsMap (CVE-2022-23568)
    • Fixes a number of CHECK-failures in MapStage (CVE-2022-21734)

    ... (truncated)

    Changelog

    Sourced from tensorflow's changelog.

    Release 2.6.4

    This releases introduces several vulnerability fixes:

    Release 2.8.0

    Major Features and Improvements

    • tf.lite:

      • Added TFLite builtin op support for the following TF ops:
        • tf.raw_ops.Bucketize op on CPU.
        • tf.where op for data types tf.int32/tf.uint32/tf.int8/tf.uint8/tf.int64.
        • tf.random.normal op for output data type tf.float32 on CPU.
        • tf.random.uniform op for output data type tf.float32 on CPU.
        • tf.random.categorical op for output data type tf.int64 on CPU.
    • tensorflow.experimental.tensorrt:

      • conversion_params is now deprecated inside TrtGraphConverterV2 in favor of direct arguments: max_workspace_size_bytes, precision_mode, minimum_segment_size, maximum_cached_engines, use_calibration and

    ... (truncated)

    Commits
    • 33ed2b1 Merge pull request #56102 from tensorflow/mihaimaruseac-patch-1
    • e1ec480 Fix build due to importlib-metadata/setuptools
    • 63f211c Merge pull request #56033 from tensorflow-jenkins/relnotes-2.6.4-6677
    • 22b8fe4 Update RELEASE.md
    • ec30684 Merge pull request #56070 from tensorflow/mm-cp-adafb45c781-on-r2.6
    • 38774ed Merge pull request #56060 from yongtang:curl-7.83.1
    • 9ef1604 Merge pull request #56036 from tensorflow-jenkins/version-numbers-2.6.4-9925
    • a6526a3 Update version numbers to 2.6.4
    • cb1a481 Update RELEASE.md
    • 4da550f Insert release notes place-fill
    • Additional commits viewable in compare view

    Dependabot compatibility score

    Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


    Dependabot commands and options

    You can trigger Dependabot actions by commenting on this PR:

    • @dependabot rebase will rebase this PR
    • @dependabot recreate will recreate this PR, overwriting any edits that have been made to it
    • @dependabot merge will merge this PR after your CI passes on it
    • @dependabot squash and merge will squash and merge this PR after your CI passes on it
    • @dependabot cancel merge will cancel a previously requested merge and block automerging
    • @dependabot reopen will reopen this PR if it is closed
    • @dependabot close will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
    • @dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
    • @dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
    • @dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)
    • @dependabot use these labels will set the current labels as the default for future PRs for this repo and language
    • @dependabot use these reviewers will set the current reviewers as the default for future PRs for this repo and language
    • @dependabot use these assignees will set the current assignees as the default for future PRs for this repo and language
    • @dependabot use this milestone will set the current milestone as the default for future PRs for this repo and language

    You can disable automated security fix PRs for this repo from the Security Alerts page.

    dependencies 
    opened by dependabot[bot] 1
  • Bump certifi from 2017.11.5 to 2022.12.7 in /docker

    Bump certifi from 2017.11.5 to 2022.12.7 in /docker

    Bumps certifi from 2017.11.5 to 2022.12.7.

    Commits

    Dependabot compatibility score

    Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


    Dependabot commands and options

    You can trigger Dependabot actions by commenting on this PR:

    • @dependabot rebase will rebase this PR
    • @dependabot recreate will recreate this PR, overwriting any edits that have been made to it
    • @dependabot merge will merge this PR after your CI passes on it
    • @dependabot squash and merge will squash and merge this PR after your CI passes on it
    • @dependabot cancel merge will cancel a previously requested merge and block automerging
    • @dependabot reopen will reopen this PR if it is closed
    • @dependabot close will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
    • @dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
    • @dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
    • @dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)
    • @dependabot use these labels will set the current labels as the default for future PRs for this repo and language
    • @dependabot use these reviewers will set the current reviewers as the default for future PRs for this repo and language
    • @dependabot use these assignees will set the current assignees as the default for future PRs for this repo and language
    • @dependabot use this milestone will set the current milestone as the default for future PRs for this repo and language

    You can disable automated security fix PRs for this repo from the Security Alerts page.

    dependencies 
    opened by dependabot[bot] 0
  • Bump tensorflow from 1.15.0 to 2.9.3 in /docker

    Bump tensorflow from 1.15.0 to 2.9.3 in /docker

    Bumps tensorflow from 1.15.0 to 2.9.3.

    Release notes

    Sourced from tensorflow's releases.

    TensorFlow 2.9.3

    Release 2.9.3

    This release introduces several vulnerability fixes:

    TensorFlow 2.9.2

    Release 2.9.2

    This releases introduces several vulnerability fixes:

    ... (truncated)

    Changelog

    Sourced from tensorflow's changelog.

    Release 2.9.3

    This release introduces several vulnerability fixes:

    Release 2.8.4

    This release introduces several vulnerability fixes:

    ... (truncated)

    Commits
    • a5ed5f3 Merge pull request #58584 from tensorflow/vinila21-patch-2
    • 258f9a1 Update py_func.cc
    • cd27cfb Merge pull request #58580 from tensorflow-jenkins/version-numbers-2.9.3-24474
    • 3e75385 Update version numbers to 2.9.3
    • bc72c39 Merge pull request #58482 from tensorflow-jenkins/relnotes-2.9.3-25695
    • 3506c90 Update RELEASE.md
    • 8dcb48e Update RELEASE.md
    • 4f34ec8 Merge pull request #58576 from pak-laura/c2.99f03a9d3bafe902c1e6beb105b2f2417...
    • 6fc67e4 Replace CHECK with returning an InternalError on failing to create python tuple
    • 5dbe90a Merge pull request #58570 from tensorflow/r2.9-7b174a0f2e4
    • Additional commits viewable in compare view

    Dependabot compatibility score

    Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


    Dependabot commands and options

    You can trigger Dependabot actions by commenting on this PR:

    • @dependabot rebase will rebase this PR
    • @dependabot recreate will recreate this PR, overwriting any edits that have been made to it
    • @dependabot merge will merge this PR after your CI passes on it
    • @dependabot squash and merge will squash and merge this PR after your CI passes on it
    • @dependabot cancel merge will cancel a previously requested merge and block automerging
    • @dependabot reopen will reopen this PR if it is closed
    • @dependabot close will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
    • @dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
    • @dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
    • @dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)
    • @dependabot use these labels will set the current labels as the default for future PRs for this repo and language
    • @dependabot use these reviewers will set the current reviewers as the default for future PRs for this repo and language
    • @dependabot use these assignees will set the current assignees as the default for future PRs for this repo and language
    • @dependabot use this milestone will set the current milestone as the default for future PRs for this repo and language

    You can disable automated security fix PRs for this repo from the Security Alerts page.

    dependencies 
    opened by dependabot[bot] 0
  • Bump protobuf from 3.5.0.post1 to 3.18.3 in /docker

    Bump protobuf from 3.5.0.post1 to 3.18.3 in /docker

    Bumps protobuf from 3.5.0.post1 to 3.18.3.

    Release notes

    Sourced from protobuf's releases.

    Protocol Buffers v3.18.3

    C++

    Protocol Buffers v3.16.1

    Java

    • Improve performance characteristics of UnknownFieldSet parsing (#9371)

    Protocol Buffers v3.18.2

    Java

    • Improve performance characteristics of UnknownFieldSet parsing (#9371)

    Protocol Buffers v3.18.1

    Python

    • Update setup.py to reflect that we now require at least Python 3.5 (#8989)
    • Performance fix for DynamicMessage: force GetRaw() to be inlined (#9023)

    Ruby

    • Update ruby_generator.cc to allow proto2 imports in proto3 (#9003)

    Protocol Buffers v3.18.0

    C++

    • Fix warnings raised by clang 11 (#8664)
    • Make StringPiece constructible from std::string_view (#8707)
    • Add missing capability attributes for LLVM 12 (#8714)
    • Stop using std::iterator (deprecated in C++17). (#8741)
    • Move field_access_listener from libprotobuf-lite to libprotobuf (#8775)
    • Fix #7047 Safely handle setlocale (#8735)
    • Remove deprecated version of SetTotalBytesLimit() (#8794)
    • Support arena allocation of google::protobuf::AnyMetadata (#8758)
    • Fix undefined symbol error around SharedCtor() (#8827)
    • Fix default value of enum(int) in json_util with proto2 (#8835)
    • Better Smaller ByteSizeLong
    • Introduce event filters for inject_field_listener_events
    • Reduce memory usage of DescriptorPool
    • For lazy fields copy serialized form when allowed.
    • Re-introduce the InlinedStringField class
    • v2 access listener
    • Reduce padding in the proto's ExtensionRegistry map.
    • GetExtension performance optimizations
    • Make tracker a static variable rather than call static functions
    • Support extensions in field access listener
    • Annotate MergeFrom for field access listener
    • Fix incomplete types for field access listener
    • Add map_entry/new_map_entry to SpecificField in MessageDifferencer. They record the map items which are different in MessageDifferencer's reporter.
    • Reduce binary size due to fieldless proto messages
    • TextFormat: ParseInfoTree supports getting field end location in addition to start.

    ... (truncated)

    Commits

    Dependabot compatibility score

    Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


    Dependabot commands and options

    You can trigger Dependabot actions by commenting on this PR:

    • @dependabot rebase will rebase this PR
    • @dependabot recreate will recreate this PR, overwriting any edits that have been made to it
    • @dependabot merge will merge this PR after your CI passes on it
    • @dependabot squash and merge will squash and merge this PR after your CI passes on it
    • @dependabot cancel merge will cancel a previously requested merge and block automerging
    • @dependabot reopen will reopen this PR if it is closed
    • @dependabot close will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
    • @dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
    • @dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
    • @dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)
    • @dependabot use these labels will set the current labels as the default for future PRs for this repo and language
    • @dependabot use these reviewers will set the current reviewers as the default for future PRs for this repo and language
    • @dependabot use these assignees will set the current assignees as the default for future PRs for this repo and language
    • @dependabot use this milestone will set the current milestone as the default for future PRs for this repo and language

    You can disable automated security fix PRs for this repo from the Security Alerts page.

    dependencies 
    opened by dependabot[bot] 0
  • Bump nbconvert from 5.3.1 to 6.5.1 in /docker

    Bump nbconvert from 5.3.1 to 6.5.1 in /docker

    Bumps nbconvert from 5.3.1 to 6.5.1.

    Release notes

    Sourced from nbconvert's releases.

    Release 6.5.1

    No release notes provided.

    6.5.0

    What's Changed

    New Contributors

    Full Changelog: https://github.com/jupyter/nbconvert/compare/6.4.5...6.5

    6.4.3

    What's Changed

    New Contributors

    Full Changelog: https://github.com/jupyter/nbconvert/compare/6.4.2...6.4.3

    6.4.0

    What's Changed

    New Contributors

    ... (truncated)

    Commits

    Dependabot compatibility score

    Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


    Dependabot commands and options

    You can trigger Dependabot actions by commenting on this PR:

    • @dependabot rebase will rebase this PR
    • @dependabot recreate will recreate this PR, overwriting any edits that have been made to it
    • @dependabot merge will merge this PR after your CI passes on it
    • @dependabot squash and merge will squash and merge this PR after your CI passes on it
    • @dependabot cancel merge will cancel a previously requested merge and block automerging
    • @dependabot reopen will reopen this PR if it is closed
    • @dependabot close will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
    • @dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
    • @dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
    • @dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)
    • @dependabot use these labels will set the current labels as the default for future PRs for this repo and language
    • @dependabot use these reviewers will set the current reviewers as the default for future PRs for this repo and language
    • @dependabot use these assignees will set the current assignees as the default for future PRs for this repo and language
    • @dependabot use this milestone will set the current milestone as the default for future PRs for this repo and language

    You can disable automated security fix PRs for this repo from the Security Alerts page.

    dependencies 
    opened by dependabot[bot] 0
  • Bump mistune from 0.8.1 to 2.0.3 in /docker

    Bump mistune from 0.8.1 to 2.0.3 in /docker

    Bumps mistune from 0.8.1 to 2.0.3.

    Release notes

    Sourced from mistune's releases.

    Version 2.0.2

    Fix escape_url via lepture/mistune#295

    Version 2.0.1

    Fix XSS for image link syntax.

    Version 2.0.0

    First release of Mistune v2.

    Version 2.0.0 RC1

    In this release, we have a Security Fix for harmful links.

    Version 2.0.0 Alpha 1

    This is the first release of v2. An alpha version for users to have a preview of the new mistune.

    Version 0.8.4

    • Support an escaped pipe char in a table cell. #150
    • Fix ordered and unordered list. #152
    • Fix spaces between = in HTML tags
    • Add max_recursive_depth for list and blockquote.
    • Fix fences code block.
    Changelog

    Sourced from mistune's changelog.

    Changelog

    Here is the full history of mistune v2.

    Version 2.0.4

    
    Released on Jul 15, 2022
    
    • Fix url plugin in &lt;a&gt; tag
    • Fix * formatting

    Version 2.0.3

    Released on Jun 27, 2022

    • Fix table plugin
    • Security fix for CVE-2022-34749

    Version 2.0.2

    
    Released on Jan 14, 2022
    

    Fix escape_url

    Version 2.0.1

    Released on Dec 30, 2021

    XSS fix for image link syntax.

    Version 2.0.0

    
    Released on Dec 5, 2021
    

    This is the first non-alpha release of mistune v2.

    Version 2.0.0rc1

    Released on Feb 16, 2021

    Version 2.0.0a6

    
    </tr></table> 
    

    ... (truncated)

    Commits
    • 3f422f1 Version bump 2.0.3
    • a6d4321 Fix asteris emphasis regex CVE-2022-34749
    • 5638e46 Merge pull request #307 from jieter/patch-1
    • 0eba471 Fix typo in guide.rst
    • 61e9337 Fix table plugin
    • 76dec68 Add documentation for renderer heading when TOC enabled
    • 799cd11 Version bump 2.0.2
    • babb0cf Merge pull request #295 from dairiki/bug.escape_url
    • fc2cd53 Make mistune.util.escape_url less aggressive
    • 3e8d352 Version bump 2.0.1
    • Additional commits viewable in compare view

    Dependabot compatibility score

    Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


    Dependabot commands and options

    You can trigger Dependabot actions by commenting on this PR:

    • @dependabot rebase will rebase this PR
    • @dependabot recreate will recreate this PR, overwriting any edits that have been made to it
    • @dependabot merge will merge this PR after your CI passes on it
    • @dependabot squash and merge will squash and merge this PR after your CI passes on it
    • @dependabot cancel merge will cancel a previously requested merge and block automerging
    • @dependabot reopen will reopen this PR if it is closed
    • @dependabot close will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
    • @dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
    • @dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
    • @dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)
    • @dependabot use these labels will set the current labels as the default for future PRs for this repo and language
    • @dependabot use these reviewers will set the current reviewers as the default for future PRs for this repo and language
    • @dependabot use these assignees will set the current assignees as the default for future PRs for this repo and language
    • @dependabot use this milestone will set the current milestone as the default for future PRs for this repo and language

    You can disable automated security fix PRs for this repo from the Security Alerts page.

    dependencies 
    opened by dependabot[bot] 0
Owner
Advanced Machine Learning specialisation by HSE
Advanced Machine Learning specialisation by HSE
SentimentArcs: a large ensemble of dozens of sentiment analysis models to analyze emotion in text over time

SentimentArcs - Emotion in Text An end-to-end pipeline based on Jupyter notebooks to detect, extract, process and anlayze emotion over time in text. E

jon_chun 14 Dec 19, 2022
Integrating the Best of TF into PyTorch, for Machine Learning, Natural Language Processing, and Text Generation. This is part of the CASL project: http://casl-project.ai/

Texar-PyTorch is a toolkit aiming to support a broad set of machine learning, especially natural language processing and text generation tasks. Texar

ASYML 726 Dec 30, 2022
Words_And_Phrases - Just a repo for useful words and phrases that might come handy in some scenarios. Feel free to add yours

Words_And_Phrases Just a repo for useful words and phrases that might come handy in some scenarios. Feel free to add yours Abbreviations Abbreviation

Subhadeep Mandal 1 Feb 01, 2022
Study German declensions (dER nettE Mann, ein nettER Mann, mit dEM nettEN Mann, ohne dEN nettEN Mann ...) Generate as many exercises as you want using the incredible power of SPACY!

Study German declensions (dER nettE Mann, ein nettER Mann, mit dEM nettEN Mann, ohne dEN nettEN Mann ...) Generate as many exercises as you want using the incredible power of SPACY!

Hans Alemão 4 Jul 20, 2022
CCF BDCI BERT系统调优赛题baseline(Pytorch版本)

CCF BDCI BERT系统调优赛题baseline(Pytorch版本) 此版本基于Pytorch后端的huggingface进行实现。由于此实现使用了Oneflow的dataloader作为数据读入的方式,因此也需要安装Oneflow。其它框架的数据读取可以参考OneflowDataloade

Ziqi Zhou 9 Oct 13, 2022
BROS: A Pre-trained Language Model Focusing on Text and Layout for Better Key Information Extraction from Documents

BROS (BERT Relying On Spatiality) is a pre-trained language model focusing on text and layout for better key information extraction from documents. Given the OCR results of the document image, which

Clova AI Research 94 Dec 30, 2022
Simple bots or Simbots is a library designed to create simple bots using the power of python. This library utilises Intent, Entity, Relation and Context model to create bots .

Simple bots or Simbots is a library designed to create simple chat bots using the power of python. This library utilises Intent, Entity, Relation and

14 Dec 15, 2021
Tensorflow Implementation of A Generative Flow for Text-to-Speech via Monotonic Alignment Search

Tensorflow Implementation of A Generative Flow for Text-to-Speech via Monotonic Alignment Search

Ankur Dhuriya 10 Oct 13, 2022
In this workshop we will be exploring NLP state of the art transformers, with SOTA models like T5 and BERT, then build a model using HugginFace transformers framework.

Transformers are all you need In this workshop we will be exploring NLP state of the art transformers, with SOTA models like T5 and BERT, then build a

Aymen Berriche 8 Apr 13, 2022
A model library for exploring state-of-the-art deep learning topologies and techniques for optimizing Natural Language Processing neural networks

A Deep Learning NLP/NLU library by Intel® AI Lab Overview | Models | Installation | Examples | Documentation | Tutorials | Contributing NLP Architect

Intel Labs 2.9k Jan 02, 2023
Diaformer: Automatic Diagnosis via Symptoms Sequence Generation

Diaformer Diaformer: Automatic Diagnosis via Symptoms Sequence Generation (AAAI 2022) Diaformer is an efficient model for automatic diagnosis via symp

Junying Chen 20 Dec 13, 2022
BERT, LDA, and TFIDF based keyword extraction in Python

BERT, LDA, and TFIDF based keyword extraction in Python kwx is a toolkit for multilingual keyword extraction based on Google's BERT and Latent Dirichl

Andrew Tavis McAllister 41 Dec 27, 2022
Bpe algorithm can finetune tokenizer - Bpe algorithm can finetune tokenizer

"# bpe_algorithm_can_finetune_tokenizer" this is an implyment for https://github

张博 1 Feb 02, 2022
Silero Models: pre-trained speech-to-text, text-to-speech models and benchmarks made embarrassingly simple

Silero Models: pre-trained speech-to-text, text-to-speech models and benchmarks made embarrassingly simple

Alexander Veysov 3.2k Dec 31, 2022
PyTorch implementation and pretrained models for XCiT models. See XCiT: Cross-Covariance Image Transformer

Cross-Covariance Image Transformer (XCiT) PyTorch implementation and pretrained models for XCiT models. See XCiT: Cross-Covariance Image Transformer L

Facebook Research 605 Jan 02, 2023
PIZZA - a task-oriented semantic parsing dataset

The PIZZA dataset continues the exploration of task-oriented parsing by introducing a new dataset for parsing pizza and drink orders, whose semantics cannot be captured by flat slots and intents.

17 Dec 14, 2022
BPEmb is a collection of pre-trained subword embeddings in 275 languages, based on Byte-Pair Encoding (BPE) and trained on Wikipedia.

BPEmb is a collection of pre-trained subword embeddings in 275 languages, based on Byte-Pair Encoding (BPE) and trained on Wikipedia. Its intended use is as input for neural models in natural languag

Benjamin Heinzerling 1.1k Jan 03, 2023
A demo of chinese asr

chinese_asr_demo 一个端到端的中文语音识别模型训练、测试框架 具备数据预处理、模型训练、解码、计算wer等等功能 训练数据 训练数据采用thchs_30,

4 Dec 09, 2021
Datasets of Automatic Keyphrase Extraction

This repository contains 20 annotated datasets of Automatic Keyphrase Extraction made available by the research community. Following are the datasets and the original papers that proposed them. If yo

LIAAD - Laboratory of Artificial Intelligence and Decision Support 163 Dec 23, 2022
🏖 Easy training and deployment of seq2seq models.

Headliner Headliner is a sequence modeling library that eases the training and in particular, the deployment of custom sequence models for both resear

Axel Springer Ideas Engineering GmbH 231 Nov 18, 2022