COMP9315 21T1 – Assignment 1 | My Assignment Tutor

18/03/2021 COMP9315 21T1 – Assignment 1 1/7DeadlineMarksLate PenaltySubmissionPre-requisitesPG_CODEPG_HOMEPG_DATAPG_LOGCOMP9315 21T1 Assignment 1Adding a Set Data Type to PostgreSQLDBMS ImplementationLast updated: Tuesday 9th March 11:05pmMost recent changes are shown in red;older changes are shown in brown.This is a working draft.Small updates and clarifications will be added as needed.No major changes are anticipated.jump to … summary introduction setup intsets ( values operations representing ) changelogsubmissionSummaryFriday 19 March, 9:00pmThis assignment contributes 15 marks toward your total mark for this course.0.089 marks off the ceiling mark for each hour late.After 7 days the assignment is worth 0 marks.From WebCMS: COMP9315 website, -> Assignments, -> Ass1 Specification, -> MakeSubmission, -> upload intset.c, intset.source, [ Makefile]orFrom CSE: give cs9315 ass1 intset.c intset.source [ Makefile]Before starting this assignment, it would be useful to complete Prac Exercise P04.This assignment aims to give you:An understanding of how data is treated inside a DBMSPractice in adding a new base type to PostgreSQLThe goal is to implement a new data type for PostgreSQL, complete with input/output functions and a range ofoperations.Make sure that you read this assignment specification carefully and completely before starting work on theassignment.Questions which indicate that you haven’t done this will simply get the response “Please read the spec”.We will be using the following names in the discussion below:The directory where your PostgreSQL source code is located(typically /srvr/$USER/postgresql-12.5)The directory where you have installed the PostgreSQL binaries(typically /srvr/$USER/pgsql)The directory where you have placed PostgreSQL’s data(typically /srvr/$USER/pgsql/data)the file to where you send PostgreSQL’s log output(typically /srvr/$USER/pgsql/log)IntroductionPostgreSQL has an extensibility model which, among other things, provides a well-defined process for addingnew data types into a PostgreSQL server.This capability has led to the development by PostgreSQL users of a number of types (such as polygons)which have become part of the standard distribution.It also means that PostgreSQL is the database of choice in research projects which aim to push theboundaries of what kind of data a DBMS can manage.18/03/2021 COMP9315 21T1 – Assignment 1 2/7In this assignment, we will be adding a new data type for dealing with sets of integers.You may implement the functions for the data type in any way you like, provided that they satisfy thesemantics given below.(in the intSets section)Note that arrays in PostgreSQL have some properties and operations that make them look a little bit like sets.However, they are not sets and have quite different semantics to the data type that we are asking you toimplement.The process for adding new base data types in PostgreSQL is described in the following sections of thePostgreSQL documentation:37.10 C-Language Functions37.13 User-defined Types37.14 User-defined OperatorsSQL: CREATE TYPESQL: CREATE OPERATORSQL: CREATE OPERATOR CLASSSection 37.13 uses an example of a complex number type, which you can use as a starting point for definingyour intSet data type (see below). Note that the complex type is a starting point only, to give an idea on hownew data types are added. Don’t be fooled into thinking that this assignment just requires you to change thename complex to intSet; the intSet type is more complex (no pun intended) than the complex numbertype.There are other examples of new data types under the tutorial and contrib directories. These may ormay not give you some useful ideas on how to implement the intSet data type:An auto-encrypted password datatypePG_CODE/contrib/chkpass/A case-insensitive character string datatypePG_CODE/contrib/citext/A confidence-interval datatypePG_CODE/contrib/seg/Setting UpYou ought to start this assignment with a fresh copy of PostgreSQL, without any changes that you might havemade for the Prac exercises (unless these changes are trivial).Note that you only need to configure, compile, and install your PostgreSQL server once for this assignment.All subsequent compilation takes place in the src/tutorial directory, and only requires modification of thefiles there.Once you have re-installed your PostgreSQL server, you should run the following commands:$ cd PG_CODE/src/tutorial$ cp complex.c intset.c$ cp complex.source intset.sourceOnce you’ve made the intset files, you should also edit the Makefile in this directory, and add the greentext to the following lines:MODULES = complex funcs intsetDATA_built = advanced.sql basics.sql complex.sql funcs.sql syscat.sql intset.sqlThe rest of the work for this assignment involves editing the intset.c and intset.source files.In order for the Makefile to work properly, you must use the identifier _OBJWD_ in the intset.source fileto refer to the directory holding the compiled library.You should never directly modify the intset.sql file produced by the Makefile.18/03/2021 COMP9315 21T1 – Assignment 1 3/7If you want to use other *.c files along with intset.c, then you can do so, but you will need to make furtherchanges to the Makefile to ensure that they are compiled and linked correctly into the library.Note that your submitted versions of intset.c and intset.source should not contain any references tothe complex type (because that’s not what you’re implementing).Make sure that the comments in the program describes the code that you wrote.Also, do not put testing queries in your intset.source; all it should do is create the new data type. Put anytesting you want to do in a separate *.sql, which you don’t need to submit. And do not drop the intSet typeat the end of intset.source. If you do, your data type will vanish before we have a chance to test it.The intSet Data TypeWe aim to define a new base type intSet, to store the notion of sets of integer values.We also aim to define a useful collection of operations on the intSet type.How you represent intSet values, and implement functions to manipulate them, is up to you.However, they must satisfy the requirements belowOnce implemented correctly, you should be able to use your PostgreSQL server to build the following kind ofSQL applications:create table Features (id integer primary key,name text);create table DBSystems (name text primary key,features intSet);insert into Features (id, name) values(1, ‘well designed’),(2, ‘efficient’),(3, ‘flexible’),(4, ‘robust’);insert into DBSystems (name, features) values(‘MySQL’, ‘{}’),(‘MongoDB’, ‘{}’),(‘Oracle’, ‘{2,4}’),(‘PostgreSQL’, ‘{1,2,3,4}’);intSet valuesIn mathematics, we represent a set as a curly-bracketed, comma-separated collection of values:Such a set contains only distinct values, and no particular ordering can be imposed.Our intSet values can be represented similarly.We can have a comma-separated list of non-negative integers, surrounded by a set of curly braces, which ispresented to and by PostgreSQL as a string.For example:‘{ 1, 2, 3, 4, 5 }’.Whitespace should not matter, so ‘{1,2,3}’ and ‘{ 1, 2, 3 }’ are equivalent.Similarly, a set contains distinct values, so ‘{1,1,1,1,1}’ is equivalent to ‘{1}’.{1, 2, 3, 4, 5}18/03/2021 COMP9315 21T1 – Assignment 1 4/7i ? S# SA >@ BA @< BA = BA BA && BA || BA !! BAnd ordering is irrelevant, so ‘{1,2,3}’ is equivalent to ‘{3,2,1}’.The integer values in the set are assumed to consist of a sequence of digits. There are no + or – signs.There can be leading zeroes, but they should effectively be ignored, e.g. 0001 should be treated the same as1.You may not assume a fixed limit to the size of the set.It may contain zero or more elements, bounded by the database’s capacity to store values.You may assume that each interger value will be less than each element in the set will be less than . Valid intSets‘{ }’‘{2,3,1}’‘{6,6,6,6,6,6}’‘{10, 9, 8, 7, 6,5,4,3,2,1}’‘{1, 999, 13, 666, 5}’‘{ 1 , 3 , 5 , 7,9 }’‘{1, 01, 001, 0001}’ (same as ‘{1}’)‘{1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20}’‘{1,2,3,4,5,6,7,8,9,10,(and then all numbers to),9999,10000}’‘ {1,5,7,9}’‘{2,4,6, 8 } ‘‘{0}’Invalid intSets‘{a,b,c}’‘{ a, b, c }’‘{1, 2.0, 3}’‘{1, {2,3}, 4}’‘{1, 2, 3, 4, five}’‘{ 1 2 3 4 }’‘1 2 3 4’‘1 2 3}’‘{ -1 }’‘{1,2,3’‘1,2,3,4,5’‘{{1,2,3,5}’‘{7,17,,27,37}’‘{1,2,3,5,8,}’ Operations on intSetsYou must implement all of the following operations for the intSet type.(assuming , , and are intSets, and is an integer):intSet contains the integer ;That is, .Give the cardinality, or number of distinct elements in, intSet ;That is, .Does intSet contain all the values in intSet ?ie, for every element of , is it an element of ?That is, the improper superset ( )Does intSet contain only values in intSet ?ie, for every element of , is it an element of ?That is, the improper subset ( ) intSetsandare equal;contains all the values of intSet and intSet contains all the valuesB BThat is, intSetof intSet ;AThat is, every element in can be found in , and vice versa.A BintSetsandare not equal;doesn’t contain all the values of intSet or intSet doesn’t contain allB BThat is, intSetthe values of intSet ;That is, some element incannot be found in , or vice versa. Takes the set intersection, and produces an intSet containing the elements common toand ;That is, .Takes the set union, and produces an intSet containing all the elements of and ;That is, .231 – 1A B S iS ii ∈ SS|S|A BB AA ⊇ BA BA BA ⊆ BA BA A BA AA BA BA ∩ BA BA ∪ B18/03/2021 COMP9315 21T1 – Assignment 1 5/7A – BTakes the set disjunction, and produces an intSet containing elements that are in andnot in ,or that are in and not in .Takes the set difference, and produces an intSet containing elements that are in andnot in .Note that this is not the same as A !! B.Below are examples of how you might use intSets, to illustrate the semantics.You can use these as an initial test set; we will supply a more comprehensive test suite later.db=# create table mySets (id integer primary key, iset intSet);CREATE TABLEdb=# insert into mySets values (1, ‘{1,2,3}’);INSERT 0 1db=# insert into mySets values (2, ‘{1,3,1,3,1}’);INSERT 0 1db=# insert into mySets values (3, ‘{3,4,5}’);INSERT 0 1db=# insert into mySets values (4, ‘{4,5}’);INSERT 0 1db=# select * from mySets order by id;id | iset—-+———1 | {1,2,3}2 | {1,3}3 | {3,4,5}4 | {4,5}(4 rows)— get all pairs of tuples where the second iset is a subset of first isetdb=# select a.*, b.* from mySets a, mySets bdb-# where (b.iset @< a.iset) and !=;id | iset | id | iset—-+———+—-+——-1 | {1,2,3} | 2 | {1,3}3 | {3,4,5} | 4 | {4,5}(2 rows)— insert extra values into the iset in tuple #4 via uniondb=# update mySets set iset = iset || ‘{5,6,7,8}’ where id = 4;UPDATE 1db=# select * from mySets where id=4;id | iset—-+————-4 | {4,5,6,7,8}(1 row)— tuple #4 is no longer a subset of tuple #3db=# select a.*, b.* from mySets a, mySets bdb-# where (b.iset @< a.iset) and !=;id | iset | id | iset—-+———+—-+——-1 | {1,2,3} | 2 | {1,3}(1 row)— get the cardinality (size) of each intSetdb=# select id, iset, (#iset) as card from mySets order by id;id | iset | card—-+————-+——1 | {1,2,3} | 32 | {1,3} | 23 | {3,4,5} | 3ABB AAB18/03/2021 COMP9315 21T1 – Assignment 1 6/7 4 | {4,5,6,7,8} |(4 rows)5 — form the intersection of each pair of setsdb=# select a.iset, b.iset, a.iset && b.isetdb-# from mySets a, mySets b where <;iset | iset | ?column?———+————-+———-{1,2,3} | {1,3} | {1,3}{1,2,3} | {3,4,5} | {3}{1,2,3} | {4,5,6,7,8} | {}{1,3} | {3,4,5} | {3}{1,3} | {4,5,6,7,8} | {}{3,4,5} | {4,5,6,7,8} | {4,5}(6 rows)db=# delete from mySets where iset @< ‘{1,2,3,4,5,6}’;DELETE 3db=# select * from mySets;id | iset—-+————-4 | {4,5,6,7,8}(1 row)— etc. etc. etc.You should think of some more tests of your own.In particular, make sure that you check that your code works with large intSets (e.g. cardinality ≥ 1000).If you come up with any tests that you think are particularly clever, feel free to post them in the commentssection below.Representing intSetsThe first thing you need to do is to decide on an internal representation for your intSet data type.You should do this after you have understood the description of the operators above.Since what they require may affect how you decide on the representation of your intSet values.Note that because of the requirement that an intSet can be arbitrarily large (see above),you cannot have a representation that uses a fixed-size object to hold values of type intSet.When you read strings representing intSet values, they are converted into your internal form,stored in the database in this form, and operations on intSet values are carried out using this data structure.When you display intSet values, you should show them in a canonical form, regardless of how they wereentered or how they are stored.The canonical form for output (at least) should include no spaces, and should have elements in ascendingorder.The first functions you need to write are ones to read and display values of type intSet.You should write analogues of the functions complex_in() and complex_out() that are defined in the filecomplex.c.Suitable names for these functions would be e.g.intset_in() and intset_out().Make sure that you use the V1 style function interface (as is done in complex.c).Note that the two input/output functions should be complementary, meaning that any string displayed by theoutput function must be able to be read using the input function.There is no requirement for you to retain the precise string that was used for input(e.g. you could store the intSet value internally in canonical form).Note that you are not required to define binary input/output functions called receive_function andsend_function in the PostgreSQL documentation, and called complex_send() and complex_recv() inthe complex.c file.18/03/2021 COMP9315 21T1 – Assignment 1 7/7Hint: test out as many of your C functions as you can outside PostgreSQL(e.g., write a simple test driver)Before you try to install them in PostgreSQL.This will make debugging much easier.You should ensure that your definitions capture the full semantics of the operators(e.g. specify commutativity if the operator is commutative).ChangeLogv1.0 (2021-02-26 15:00:00+10:00)released Assignment 1v1.1 (2021-02-27 11:00:00+10:00)Modify the “Operations on intSets” sectionChange the symbol for “contains” from ‘’ to ‘@@’ symbolAdd the “inequality” operation, using the ‘’ symbolv1.2 (2021-02-26 16:00:00+10:00)Corrected typo in examples of Valid IntSetsAdd addidtional examples of (In)Valid IntSetsAdd an upper bound for the size of each element in an IntSetv1.3 (2021-02-26 18:00:00+10:00)Correct superset and subset symbles used in “Operations on intSets”v1.4 (2021-03-01 18:00:00+10:00)Moved deadline forward to 9pm to be consistent with CSE policySubmissionYou need to submit two files:intset.c – containing the C functions that implement the internals of the intSet data type.intset.source – containing the template SQL commands to install the intSet data type into aPostgreSQL server.Do not submit the intset.sql file, since it contains absolute file names which are not useful in our testenvironment.If your system requires other *.c files, you should submit them, along with the modified Makefile from thesrc/tutorial directory.Do not include:create table …insert into …select …drop type …Or any other statements not directly needed for creating the intSet data type in intset.source.Have fun, jas and dylan.


Leave a Reply

Your email address will not be published. Required fields are marked *