Project 2

Due: July 2, before midnight.

Important Reminder: As per the course Academic Honesty Statement, cheating of any kind will minimally result in your letter grade for the entire course being reduced by one level.

This document first provides the aims of this project. It then lists the requirements as explicitly as possible. This is followed by a log which should help you understand the requirements. Finally, it provides some hints as to how those requirements can be met.

Aims

The aims of this project are as follows:

  • To give you more experience with JavaScript/TypeScript programming.

  • To familiarize you with asynchronous programming.

  • To expose you to mongodb.

  • To practice TDD by writing a few tests.

Requirements

You must push a submit/prj2-sol directory to your github repository such that typing npm ci within that directory followed by tsc is sufficient to run the project using ./dist/index.js.

You are being provided with an index.ts which provides the required command-line behavior. What you specifically need to do is add code to the provided src/lib/spreadsheet-dao.ts source file as per the requirements in that file. Database errors should be returned with code set to DB using the same error handling convention as in your previous project.

You are also required to add a test marked with TODO in test/spreadsheet-dao.ts. Your implementation must pass all the tests.

As in your previous project, the emphasis is on correctness rather than efficiency.

The command-line behavior of the program is illustrated in this annotated log.

Provided Files

You are being provided with a directory prj2-sol directory which contains a start for your project. It contains the following files:

spreadsheet-dao.ts

This is the file which you will need to modify. It contains skeletons for the methods you need to implement. You should feel free to add any auxiliary function or class definitions as needed.

spreadsheet.ts

This is an extended version of a solution to your previous project. Changes include:

  • The API is synchronous rather than asynchronous.

  • Additional functionality allowing deleting and copying of spreadsheet cells as well as clearing and dumping the entire spreadsheet.

  • The code is based on the version of the solution which used a multi-pass validation first approach to avoid error recovery.

You should not need to modify this file.

ss-services.ts

This file uses the domain logic implemented in spreadsheet.ts to validate service requests. If validated, then the request is persisted using spreadsheet-dao.ts.

index.ts and cli.ts

These files provide command line handling to dispatch user commands to the spreadsheet services. It imports spreadsheet-dao.ts. You should not need to modify these files.

expr-parser.ts

A parser for spreadsheet formulae similar to that provided for the previous project. You should not need to refer to this file at all.

utils.ts and limits.ts

Utility routines and definitions for the spreadsheet limits. You should not need to modify these files.

tsconfig.json

A configuration file for typescript. You may modify this file if necessary.

README

A README file which must be submitted along with your project. It contains an initial header which you must complete (replace the dummy entries with your name, B-number and email address at which you would like to receive project-related email). After the header you may include any content which you would like read during the grading of your project.

The test directory contains tests files for the above modules. The tests files relevant to this project are:

spreadsheet-mem-dao.ts

Contains code to set up and tear down instances of the DAO using the mongo server which stores all data only in memory.

spreadsheet-dao.ts

Tests for the DAO functionality. You are required to implement one test in this file.

The extras directory contains a LOG file which illustrates the command-line behavior of your program.

MongoDB

MongoDB is a popular nosql database. It allows storage of collections of documents to be accessed by a primary key named _id. This sections gives an overview, not all of which may be relevant to this project.

In terms of JavaScript, mongodb documents correspond to arbitrarily nested JavaScript Objects having a top-level _id property which is used as a primary key. If an object does not have an _id property, then one will be created with a unique value assigned by mongodb.

  • MongoDB provides a basic repertoire of CRUD Operations.

  • All asynchronous mongo library functions can be called directly using await.

  • It is important to ensure that all database connections are closed. Otherwise your program will not exit gracefully.

You can play with mongo by starting up a mongo shell:

$ mongosh
Current Mongosh Log ID:	...
...
For mongosh info see: https://docs.mongodb.com/mongodb-shell/
...
> help

  Shell Help:

    use DB_NAME                    #Set current database
    ...
    ...
>
  • Important: Since mongodb is available for different languages and versions, when looking at documentation make sure that you are looking at the nodejs documentation and API for v5.x. This is very important as the top results from web searches will often point you to inapplicable documentation.

  • It is important to emphasize that mongo is promise-ready. Hence if a mongo function is documented to require a callback, then it will return a Promise when called without a callback which can then be used with await.

  • Usually, there is no need to explicitly create a database or collection in mongo; both will be created transparently when data is first inserted into a collection in the database. The exception is when the newly created database or collection requires non-default attributes.

  • Any attempt to drop a non-existing collection will fail with an error. If this is a possibility in your design, then you should check whether a collection exists before attempting to drop it. Note that it is possible to list out the collections in a database by using the synchronous db.listCollections() method followed by an asynchronous toArray() on the resulting cursor.

  • Since opening a connection to a database is an expensive operation, it is common to open up a connection at the start of a program and hang on to it for the duration of the program. It is also important to remember to close the connection before termination of the program (this is handled by the main program calling your implementation of the close() method).

    [Note that the provided command-line program for this project performs only a single command for each program run. This is not typical and will not be the case in future projects.]

  • As mentioned in class, it is impossible to have an async constructor. Hence an instance of SpreadsheetDao is created using an async factory function makeSpreadsheetDao(). This factory function will need to perform the following:

    1. Get a connection to a mongodb server using the mongo client's asynchronous connect() method.

    2. Given the connection to a server, you can get to a specific database using the synchronous db() method.

    3. Give a database, you can get to a specific collection using the synchronous collection() method; it will automatically create it if it does not exist as long as mongo's strict mode is off.

    4. Given a collection, create any necessary indexes using createIndex().

    5. Call the private constructor passing in all information which will be needed by the DAO instance.

  • Given a collection, a large number of collection methods are available. Some of the methods which may be useful in this project include:

    • Given a collection, you can asynchronously find a single document using the findOne() method.

    • Given a collection, you can asynchronously update a document using the updateOne() or findOneAndUpdate() method. Note that the upsert option is useful if you wish to create a new document when there is no existing document to be updated. Note also that the latter can return the document after the update if the returnDocument option is set to mongo.ReturnDocument.AFTER.

    • Given a collection, you can asynchronously delete a single document using deleteOne() or multiple documents from it using deleteMultiple().

    • Given a collection, you can asynchronously insert into it using the insert*() methods.

  • Given a collection, you can asynchronously find() a cursor which meets the criteria specified by a filter to find(). The query can be used to filter the collection; specifically, if the filter specifies a primary key like _id, then the cursor returned by the find() should contain at most one result.

    Given a cursor, you can get all its results as an array using the asynchronous toArray() method.

  • Mongo db indexes can be used to facilitate search. particular, it supports a single text index on each collection.

    Ideally, indexes should be created on a collection before the collection itself is created. Hence when getting the initial connection to a database it is a good idea to check whether a collection exists and create the collection and its indexes when it does not. Note that the check can be done using the aforementioned db.listCollections() method.

  • Note that the documentation lists many methods as returning a Promise<T> where T is some type. The documentation has T as a link which you can follow to get details for the returned type. For example, the insertOne() method is documented as returning a Promise<InsertOneResult<TSchema>> where InsertOneResult is a link which you can follow to discover the details of that type.

Design and Implementation Considerations

The following information may be useful when working on this project.

Database Design

With relational databases, the data corresponding to the rows of a single table has to have essentially the same structure. That is not the case with a mongodb collection which can store documents which may have different structures, though it is probably not a great idea to store documents with totally different different structures within a single collection.

Though it is possible to use mongo indexes to support efficient query of data, keep in mind that indexes need to be maintained and doing so will result in less efficient data updates.

Every mongo document will have an _id property which is a mongodb ObjectId which serves as a primary key for that document. However, that _id property is an implementation detail and should not be exposed externally as it may change for database migrations. When using mongodb via JavaScript it is possible to specify a non-ObjectId property for _id, but that is not possible when using mongodb via TypeScript. Hence other properties should be used to identify a document. In that case, it is a good idea to set up indexes on those properties.

Storing Spreadsheet within mongo

Abstractly, your database should maintain a mapping from a spreadsheet cell specified by the pair [spreadsheetName, cellId] to an expr for that cell. There are several possibilities:

  1. A single collection indexed by spreadsheetName mapping the name of a spreadsheet to a document containing the spreadsheet content. A spreadsheet document would simply be a { [cellId: string]: expr } mapping.

    A disadvantage of this approach is that querying or updating a single cell would access the entire spreadsheet.

  2. A single collection containing {spreadsheetName, cellId, expr} documents. There would be indexes on spreadsheetName and cellId.

  3. A collection per spreadsheet. Each collection would contain {cellId, expr} documents with an index on cellId.

Testing

Testing code which uses a database is always problematic since it is necessary to reset the database to a known state after each test. An in-memory database is ideal for this as it is fast and also resets automatically to empty after each restart. We will use the mongod-memory-server package to faciliate testing.

When a mongod memory server is started up, it provides a mongodb URL. spreadsheet-mem-dao.ts uses that URL to create an instance of the spreadsheet DAO being tested.

It is often necessary to set up a common environment for each test. It is also necessary to ensure that this enviroment is not shared by different tests to ensure that the tests remain independent and can be run in any order or even in parallel. This is done by having the test framework support the setup and teardown of test fixtures. With mocha this is done by using beforeEach() functions to perform the setup and afterEach() functions to perform the teardown.

Since most of the API provided by mongo is asynchronous, the corresponding DAO methods you will be implementing will need to be declared async. Consequently, the tests will also need to be asynchronous.

There are two ways of writing asynchronous tests in mocha:

  1. The legacy method which was used before async and await were added to JavaScript. The testing function is written to take a parameter, typically called done. When the test is run, mocha will pass a function as the value of done. The programmer is required to call done() when the test completes.

        it('should test some async code', done => {
          //test code and expectations; will either need to use
          //callback handlers or promises.
          done();
        });
    
  2. The modern method which uses async and await support. It is sufficient to merely declare the testing function async:

        it('should test some async code', async () => {
          //test code and expectations; can use await
        });
    

It is an error to use both methods within a single test.

Hints

The following points are worth noting:

  • All validation is handled outside the DAO. Hence the only errors which can occur within the DAO are database errors. These are signalled by returning an error Result with code set to DB.

  • Since all errors are being handled by returning Result<T>'s, async functions/methods will be returning a Promise<Result<T>>.

  • Recall that you can specify that only certain test methods run by adding the .only suffix to the method name as in describle.only or it.only. Similarly, you can skip test methods by adding a .skip suffix to the method name.

The following steps are not prescriptive. The order of some of the steps may need to change depending on your chosen database representation. or

  1. Review the material covered in class, specifically the User Authentication example with emphasis on the Auth DAO code.

  2. Understand the project requirements thoroughly. Decide on your database representation.

  3. Set up your project directory and package.json in a manner similar to your previous project.

    Your development dependencies should minimally include @types/chai, @types/mocha, @types/node, @types/mongodb, typescript, mongodb-memory-server chai, mocha and shx.

    Your runtime dependencies should minimally include mongodb and the course libraries:

        https://sunybing:sunybing@zdu.binghamton.edu
            /cs544/lib/cs544-js-utils-0.0.1.tgz
        https://sunybing:sunybing@zdu.binghamton.edu
            /cs544/lib/cs544-node-utils-0.0.1.tgz
    

    Once you have set up your project correctly, you should be able to run the command-line program sufficiently to get an usage message:

        $ node ./dist/index.js
        usage: index.js MONGO_DB_URL SPREADSHEET_NAME CMD [ARGS...]
        Command CMD can be
          clear 
            clear spreadsheet
          ...
          query CELL_ID
            return formula and current value of cell specified by CELL_ID
    
    

    and run the tests with most of the spreadsheet-dao.ts tests failing.

  4. Add code to create the spreadsheet DAO. Ensure that your code passes the provided must catch bad spreadsheet creation test.

  5. Add code to close() the DAO.

  6. Implement the DAO methods. A suggested order is the order in which they are listed in the provided file, but your YMMV depending on your chosen database representation. Activate tests as you implement the corresponding methods (you can usually see which methods a test uses by searching for the string dao.).

    When you implement the clear() method implement the corresponding test (annotated with TODO). You can use the code in the previous must set multiple cell expressions test as a starting point.

  7. Iterate until you meet all requirements. All tests should pass and you should be able to run the command-line program with behavior similar to that of the sample LOG.

  8. Clean up:

    1. Remove any .only or .skip you may have added to the tests.

    2. Comment out or remove any added debugger lines.

    3. Remove any print statements so that the tests run without producing extraneous output.

It is a good idea to commit and push your project periodically whenever you have made significant changes.

Submit as per your previous project. Before submitting, please update your README to document the status of your project:

  • Document known problems. If there are no known problems, explicitly state so.

  • Anything else which you feel is noteworthy about your submission.

If you want to make sure that your github submission is complete, clone your github repo into a new directory, say ~/tmp. You should then be able to do a npm ci to build and run your project.