World's most popular travel blog for travel bloggers.

### Key Differences Between Strong Entity and Weak Entity

1. The basic difference between strong entity and a weak entity is that the strong entity has a primary key whereas, a weak entity has the partial key which acts as a discriminator between the entities of a weak entity set.
2. A weak entity always depends on the strong entity for its existence whereas, a strong entity is independent of any other entity’s existence.
3. A strong entity is denoted with a single rectangle and a weak entity is denoted with a double rectangle.
4. The relationship between two strong entities is denoted with single diamond whereas, a relationship between a weak and a strong entity is denoted with double diamond called Identifying Relationship.
5. The strong entity may or may not show the total participation in its relations, but the weak entity always shows total participation in the identifying relationship which is denoted by the double line.
Design ER n Table accordingly

MVD

#### Question:

The following are the relational schemes of Employee, Project and Assigned-to Employee (Emp#, Emp_name, Profession), Project (Proj#, Proj_name, Chief_Architect), Assigned-to (Proj#, Emp#).

Create appropriate samples of each relation according to the question. Write the following queries in SQL.
(i)  Get Emp# of employees working on Project numbered MCS-043.
(ii) Get details of employees working on database projects.
(iii) Finally create an optimal query tree for each query.

Given the following semi-structure data in XML, create the DTD
(Document Type Declaration) for it

<document>
<student>
<NAME>
</student>
<student>
<NAME>
</student>
</document>

What are the different options available for storing XML data?

## Introduction

• Project Name: - Medical Store Management
• This project is used mainly for medical stores to maintain the details of medical store such as stock and account.
• This medical shop management software is so designed as to ease the work load of medical shop professionals. The main feature includes inventory and stock control, accounting, client management

## Scope & Objectives

As this is generic software it can be used by a wide variety of outlets (Retailers) to automate the process of manually maintaining the records related to the subject of maintaining the stock and cash flows.
This project is basically updating the manual chemist inventory System To Automated inventory system, So that organization can manage their record in efficient and organized form.

This software helps you to track all the products of medical shop moreover it’s a medical shop accounting software. Flexible and adaptive software suited to medical shops or stores or pharmacies of any size.
 Project Characteristics:
 Customer management
 Transition management
 Sales management
 Reporting

 The main goal of the application is to maintain the records of purchase, Sales and stock details with cash transaction maintenance.
 Medical Store Management software is very needy for Medical Store . This software help them maintain day to day transactions in computer.

## Background & Specification

A medical store needs to maintain its inventory of medicines and other products using a
computerized system. It is planning to create a network of computers which should be
placed at various sales and cash counters. It also proposes to have a centralized
workstation for the database and system administrators. Customer orders are accepted
at the sales counters which in turn produces a medicine collection challan. The challan
includes the order number, name of medicine, batch number, date of expiry, shelf
number where it is kept and quantity ordered. One order may contain more than one
medicine. As per challan, medicines are put in a basket by a person, who passes it to
billing assistant. Billing assistant checks the medicine is as per the challan, any
shortcoming is either corrected or reported to customer. On receiving conformation
from the customer the bill is generated. The cash counter collects the money as per the
bill and dispenses the medicine to the customer.

## Project Features

 There Will be two user Accessing System System
 Manager : Who will act As Administrator
 Other Staff: Who Will accessing the System
 User The Features for manager are
 Add ,delete update any product
 Manage store i-e(put price,make salaries,Calculate Revenue )

## User Classes and Characteristics

 User Of project include customers and staff
 Customer can be Member or visitor who are accessing this system.
 Staff Which act as administrator and controlling overall system
 User Should IT literate And know to use computer
 Cashier Should Know Data entry & Typing
 Manager should have knowledge of Internet & Browsing

## Operating environment

 This project will be operating in windows environment.Also compatible with internet explorer.
 The only requirement For using this project is having machine. Design and implemention constraints
 This project is developed using java.on the back hand for database we are using Sql server. The product is Accomplished With the login facility for user. User Documentation
 This project will include a user manual. The user manual include Complete overview of the producst,Configuration of the Tool used (Sql Sever or other), technical details, backup procedures and Contact Information which will include email address and Ph# .

## Hardware Requirements

 Processor : 1.6 Ghz and above
 RAM : 1GB RAM
 Monitor : 1 5”Colour Monitor
 Processor Speed : 1.7 GHZ
 Hard disk : 10GB HDD
 CD Drive : 52- X CD ROM
 Keyboard : Mercury 110 Keys
 Mouse : Logitech Mouse

## System Features Description & priorirty

 Proposed Database is intended to store, retrieve, update, andmanipulate information related to Chemist which include
 Order Processing & taking
 Staff information
 Customer Bill Detail
 Product Details
 Calculation of Revenue
 Searching of product
 Generate Reports

## Functional Requirements

 The software must allow input of products data from Administrator& secured access at , and from the data streaming real-time monitoring equipment
 The project must require high levels of error correction and input validation.
 The project must allow browsing by the Director&Staff of Cms To Acces And update information products & Customers ,vendors.
 The project must identify the Products & Customer by a unique numeric identifier derived from a function performed on the Customer’s birth date or product Id;

## Safety Requirements

 The Database may get crashed or damaged due to some viruses or operating system requirements.Therefore Its is mandatory to have backup for your data.Ups/inverter facility should be there in case of power failure.
 Security Requirements
 System will use secure Database
 Staff can just see the products & mark their attendance .They Cannot edit or,modify anything except their personal information.
 Proper user Authentication Will be provided.
 There should be separate account for Admin &user. So that no one else can access the database except Admin.

## User requirements

 The User Of system are Staff ,Managers and customer of the store.
 The members share assumed to have basic knowledge of computer & internet browsing While administrator of system should have more knowledge so he/she can resolve small problems and perform information’s.
 The user manual ,installation guide and other related material should be sufficient to educate the user how to use and maintain the system.

 DFD 0 Level
 DFD 1 level

Type of ExpensesWithout SoftwareWith Software
Employee  6 3
-- Cost (per year)5,76,0002,88,000
Software Exp.(One time)02,00,000
Medicine Expiry
(Per Year)
250000-1000
Customer Return
(No Stock)
25000 0-1000
Management
Maintenance
800020000
Purchase Order,
Gov. Reports,
Stock Analysis
2 hour daily
730hrs/year
= 36500INR
@ 50INR/hr
5 min
= 29.9hrs/yr
= 1460INR
@ 50INR/hr
List of Customers (Value)0(-10000)
Total(First Year)6,70,500/year501460/year

## In terms of : Medical store management system

This project is used mainly for medical stores to maintain the details of medical store such as stock and account.

This medical shop management software is so designed as to ease the work load of medical shop professionals.

The main feature includes inventory and stock control, accounting, customer management.

> Less staff
> Easy Payments
> Less Expired Products
> Autmoatic Stock Re-Order level, no need see all the shelf for the medicine stock.
> One click report of expiring soon medicine, minimum stock

There is different way to look for the system costs.

1. Hardware needs in shop: Facilitate computer, printer and related equipment to shop (~ 1.2 lac)
2. Software cost:
1. Analysis Cost
2. Designing Cost
3. Developing Cost
4. Testing Cost
5. Installing and maintenance

### Reason why we choose Waterfall model over Spiral Model

Spiral model
1. Good for large and critical projects
2. Working software is produced early during the lifecycle.
3. Large amount of risk analysis.
As we know we don't have a big project, so there is no need to use Spiral Model

### Reason why we choose Waterfall model over Iterative Model

Iterative Model

1. Produces working software early during the life-cycle.
2. More flexible as scope and requirement changes can be implemented at low cost.
3. Testing and debugging is easier, as the iterations are small.
4. Low risks factors as the risks can be identified and resolved during each iteration.

## SDLC Model: Waterfall Model

#### Reason of selecting Waterfall Model

ØThis is Small Project and Requirements Are Well understood.
ØThis model is simple and easy to understand and use.

ØWaterfall model is simple to implement and also the amount of resources required for it are minimal.
ØIt is easy to manage due to the rigidity of the model – each phase has specific deliverables and a review process.

# Data Mining: What is Data Mining?

### Overview

Generally, data mining (sometimes called data or knowledge discovery) is the process of analyzing data from different perspectives and summarizing it into useful information - information that can be used to increase revenue, cuts costs, or both. Data mining software is one of a number of analytical tools for analyzing data. It allows users to analyze data from many different dimensions or angles, categorize it, and summarize the relationships identified. Technically, data mining is the process of finding correlations or patterns among dozens of fields in large relational databases.

### Continuous Innovation

Although data mining is a relatively new term, the technology is not. Companies have used powerful computers to sift through volumes of supermarket scanner data and analyze market research reports for years. However, continuous innovations in computer processing power, disk storage, and statistical software are dramatically increasing the accuracy of analysis while driving down the cost.

### Example

For example, one Midwest grocery chain used the data mining capacity of Oracle software to analyze local buying patterns. They discovered that when men bought diapers on Thursdays and Saturdays, they also tended to buy beer. Further analysis showed that these shoppers typically did their weekly grocery shopping on Saturdays. On Thursdays, however, they only bought a few items. The retailer concluded that they purchased the beer to have it available for the upcoming weekend. The grocery chain could use this newly discovered information in various ways to increase revenue. For example, they could move the beer display closer to the diaper display. And, they could make sure beer and diapers were sold at full price on Thursdays.

### Data, Information, and Knowledge

#### Data

Data are any facts, numbers, or text that can be processed by a computer. Today, organizations are accumulating vast and growing amounts of data in different formats and different databases. This includes:
• operational or transactional data such as, sales, cost, inventory, payroll, and accounting
• nonoperational data, such as industry sales, forecast data, and macro economic data
• meta data - data about the data itself, such as logical database design or data dictionary definitions

#### Information

The patterns, associations, or relationships among all this data can provide information. For example, analysis of retail point of sale transaction data can yield information on which products are selling and when.

#### Knowledge

Information can be converted into knowledge about historical patterns and future trends. For example, summary information on retail supermarket sales can be analyzed in light of promotional efforts to provide knowledge of consumer buying behavior. Thus, a manufacturer or retailer could determine which items are most susceptible to promotional efforts.

#### Data Warehouses

Dramatic advances in data capture, processing power, data transmission, and storage capabilities are enabling organizations to integrate their various databases into data warehouses. Data warehousing is defined as a process of centralized data management and retrieval. Data warehousing, like data mining, is a relatively new term although the concept itself has been around for years. Data warehousing represents an ideal vision of maintaining a central repository of all organizational data. Centralization of data is needed to maximize user access and analysis. Dramatic technological advances are making this vision a reality for many companies. And, equally dramatic advances in data analysis software are allowing users to access this data freely. The data analysis software is what supports data mining.

### What can data mining do?

Data mining is primarily used today by companies with a strong consumer focus - retail, financial, communication, and marketing organizations. It enables these companies to determine relationships among "internal" factors such as price, product positioning, or staff skills, and "external" factors such as economic indicators, competition, and customer demographics. And, it enables them to determine the impact on sales, customer satisfaction, and corporate profits. Finally, it enables them to "drill down" into summary information to view detail transactional data.
With data mining, a retailer could use point-of-sale records of customer purchases to send targeted promotions based on an individual's purchase history. By mining demographic data from comment or warranty cards, the retailer could develop products and promotions to appeal to specific customer segments.
For example, Blockbuster Entertainment mines its video rental history database to recommend rentals to individual customers. American Express can suggest products to its cardholders based on analysis of their monthly expenditures.
WalMart is pioneering massive data mining to transform its supplier relationships. WalMart captures point-of-sale transactions from over 2,900 stores in 6 countries and continuously transmits this data to its massive 7.5 terabyte Teradata data warehouse. WalMart allows more than 3,500 suppliers, to access data on their products and perform data analyses. These suppliers use this data to identify customer buying patterns at the store display level. They use this information to manage local store inventory and identify new merchandising opportunities. In 1995, WalMart computers processed over 1 million complex data queries.
The National Basketball Association (NBA) is exploring a data mining application that can be used in conjunction with image recordings of basketball games. The Advanced Scout software analyzes the movements of players to help coaches orchestrate plays and strategies. For example, an analysis of the play-by-play sheet of the game played between the New York Knicks and the Cleveland Cavaliers on January 6, 1995 reveals that when Mark Price played the Guard position, John Williams attempted four jump shots and made each one! Advanced Scout not only finds this pattern, but explains that it is interesting because it differs considerably from the average shooting percentage of 49.30% for the Cavaliers during that game.
By using the NBA universal clock, a coach can automatically bring up the video clips showing each of the jump shots attempted by Williams with Price on the floor, without needing to comb through hours of video footage. Those clips show a very successful pick-and-roll play in which Price draws the Knick's defense and then finds Williams for an open jump shot.

### How does data mining work?

While large-scale information technology has been evolving separate transaction and analytical systems, data mining provides the link between the two. Data mining software analyzes relationships and patterns in stored transaction data based on open-ended user queries. Several types of analytical software are available: statistical, machine learning, and neural networks. Generally, any of four types of relationships are sought:
• Classes: Stored data is used to locate data in predetermined groups. For example, a restaurant chain could mine customer purchase data to determine when customers visit and what they typically order. This information could be used to increase traffic by having daily specials.
• Clusters: Data items are grouped according to logical relationships or consumer preferences. For example, data can be mined to identify market segments or consumer affinities.
• Associations: Data can be mined to identify associations. The beer-diaper example is an example of associative mining.
• Sequential patterns: Data is mined to anticipate behavior patterns and trends. For example, an outdoor equipment retailer could predict the likelihood of a backpack being purchased based on a consumer's purchase of sleeping bags and hiking shoes.
Data mining consists of five major elements:
• Extract, transform, and load transaction data onto the data warehouse system.
• Store and manage the data in a multidimensional database system.
• Analyze the data by application software.
• Present the data in a useful format, such as a graph or table.
Different levels of analysis are available:
• Artificial neural networks: Non-linear predictive models that learn through training and resemble biological neural networks in structure.
• Genetic algorithms: Optimization techniques that use processes such as genetic combination, mutation, and natural selection in a design based on the concepts of natural evolution.
• Decision trees: Tree-shaped structures that represent sets of decisions. These decisions generate rules for the classification of a dataset. Specific decision tree methods include Classification and Regression Trees (CART) and Chi Square Automatic Interaction Detection (CHAID) . CART and CHAID are decision tree techniques used for classification of a dataset. They provide a set of rules that you can apply to a new (unclassified) dataset to predict which records will have a given outcome. CART segments a dataset by creating 2-way splits while CHAID segments using chi square tests to create multi-way splits. CART typically requires less data preparation than CHAID.
• Nearest neighbor method: A technique that classifies each record in a dataset based on a combination of the classes of the k record(s) most similar to it in a historical dataset (where k 1). Sometimes called the k-nearest neighbor technique.
• Rule induction: The extraction of useful if-then rules from data based on statistical significance.
• Data visualization: The visual interpretation of complex relationships in multidimensional data. Graphics tools are used to illustrate data relationships.

### What technological infrastructure is required?

Today, data mining applications are available on all size systems for mainframe, client/server, and PC platforms. System prices range from several thousand dollars for the smallest applications up to $1 million a terabyte for the largest. Enterprise-wide applications generally range in size from 10 gigabytes to over 11 terabytes. NCR has the capacity to deliver applications exceeding 100 terabytes. There are two critical technological drivers: • Size of the database: the more data being processed and maintained, the more powerful the system required. • Query complexity: the more complex the queries and the greater the number of queries being processed, the more powerful the system required. Relational database storage and management technology is adequate for many data mining applications less than 50 gigabytes. However, this infrastructure needs to be significantly enhanced to support larger applications. Some vendors have added extensive indexing capabilities to improve query performance. Others use new hardware architectures such as Massively Parallel Processors (MPP) to achieve order-of-magnitude improvements in query time. For example, MPP systems from NCR link hundreds of high-speed Pentium processors to achieve performance levels exceeding those of the largest supercomputers. Problem Detail: Let$M$be a square matrix with entries that are$0$or$1$and let$v$be a vector with values that are also$0$or$1$. If we are given$M$and$y = Mv$, we can computer$v$if$M$is non-singular. Now let us take the second bit (from the right) of the binary representation of each$y_i$as another vector$z$. So$z$also has entries which are$0$or$1$. If$y_i$has fewer than two bits we just let$z_i=0$. If we are given$z$and$M$, how (and when) can you find a$v$so that$Mv$would produce$z$under this operation? Here is an example $$M = \begin{pmatrix} 0 & 0 & 1 & 0\\ 1 & 1 & 0 & 1\\ 1 & 1 & 1 & 0\\ 0 & 1 & 1 & 1\\ \end{pmatrix} , v = \begin{pmatrix} 0 \\ 1 \\ 1 \\ 1\\ \end{pmatrix} \implies Mv=\begin{pmatrix} 1 \\ 2 \\ 2 \\ 3\\ \end{pmatrix} .$$ So in this case $$z = \begin{pmatrix} 0 \\ 1 \\ 1 \\ 1 \\ \end{pmatrix}.$$ Is this problem in fact NP-hard? ###### Asked By : Lembik #### Answered By : Rick Decker This problem can be cast into a familiar form: it's an example of what is known as a 0-1 linear programming problem. You have a set of variables$v_1, v_2, \dots v_n$subject to constraints of the form$m\le a_1v_1+a_2v_2+\cdots +a_nv_n\le M$. In your particular problem, we also have$a_i \in \{0, 1\}$. You are looking for any feasible solutions, namely tuples$(v_1, \dots, v_n)$which satisfy all the constraints. For example, suppose we have $$M = \left( \begin{array}{cccc} 0 & 0 & 1 & 0\\ 1 & 1 & 0 & 1\\ 1 & 1 & 1 & 0\\ 0 & 1 & 1 & 1 \end{array} \right), \quad{\text{and}}\quad z= \left( \begin{array}{c} 0\\1\\0\\1 \end{array}\right)$$ then you want $$v= \left( \begin{array}{c} a\\b\\c\\d \end{array}\right)$$ such that$Mv$will be, in binary (with the lower-order bit unspecified) $$Mv=\left(\begin{array}{c} c \\ a+b+d \\ a + b + c \\ b + c + d \end{array}\right) =\left(\begin{array}{c} \mathtt{0\_}\\ \mathtt{1\_}\\ \mathtt{0\_}\\ \mathtt{1\_} \end{array}\right)$$ From this we have the constraints $$\begin{array}{c} 0\le a, b, c, d\le 1\\ 2\le a + b + d\le 3\\ 0\le a + b + c\le 1\\ 2\le b + c + d\le 3 \end{array}$$ and you want to find$a, b, c, d$such that all the constraints are simultaneously satisfied. We've moved into 0-1 LP land because there are established procedures for solving problems like this, though sadly there's nowhere near enough room in this post to introduce them, so you'll have to do the legwork yourself. Be aware that problems like this are very compute-intensive (by which I mean NP-hard) and as far as I know you're not going to get anything like an elegant formulation of the form "This can be solved if and only if$z$is of the form ..." By the way, the example I used does indeed have a solution,$v$, and in this particular case is unique, though in general you'll have several solutions (if there are any at all). ###### Best Answer from StackOverflow Question Source : http://cs.stackexchange.com/questions/23428 3.2K people like this Problem Detail: I'm ordering functions by their asymptotic growth for an assignment and I have verified I have the correct order by using limits, but I'm trying to understand why$n^{log\ log\ n}$is between$n^3$and$2^n$. It seems like it should be closer to the order of$n^n$. Can anyone provide some insight on this. ###### Asked By : realgenob #### Answered By : Yuval Filmus It could help if you wrote$n^{\log\log n}$as$2^{\log n\log\log n}$. Then it should be clear that$2^{\log n\log\log n} = o(2^n)$. The morale of this is that the base of an exponent doesn't count as much as the exponent itself. ###### Best Answer from StackOverflow Question Source : http://cs.stackexchange.com/questions/23496 3.2K people like this Problem Detail: I have this nondeterministic pda: $$\Sigma= \{a,b,c\}$$ and $$L=\{\omega\ \epsilon\ \Sigma^*\ |\ \omega\ = \alpha\beta\beta^R\gamma\ and\ \alpha,\beta,\gamma\ \epsilon\ \Sigma^*\ and\ |\beta|\ >3 \}$$ So once i have create the NPDA, i have to calculate the probability of accepting a correct word, i know it depends on the size of$\alpha$and the "free" jumps ($\varepsilon,\varepsilon->\varepsilon$). My problem is that i can't find the exact function of probability can someone explain me how to do it? Thanks. ###### Asked By : Curious Bystander #### Answered By : Hendrik Jan In general when constructed in a natural way, you have to guess the correct position of$\beta$, so you have to have two positions right. But there is a catch. In fact$L$can alternatively be defined as$\{\omega \mid \omega = \alpha\beta\beta^R\gamma \text{ and } |\beta|=4 \}$. This is seen as folows: if you find any substring$\beta\beta^R$then the central part of length 4+4=8 will also satisfy the palindromic substring constraint. So,$L$is regular, and one can find a DFA for it. ###### Best Answer from StackOverflow Question Source : http://cs.stackexchange.com/questions/18464 3.2K people like this Problem Detail: Is$L = \{w_1cw_2 : w_1, w_2 \in \{a, b\}^* , w_1 \neq w_2 \}$a CFL? In my opinion it is not since if we want to know the inequality of$w_1$and$w_2$we must be aware of their equality and that is not a$CFG$. ###### Asked By : Hadi Amiri #### Answered By : Denis This is a CFL, but the complement (with equality) is not. Notice that both the language and the complement become CFL when you mirror one of the two words. To show that this is a CFL, you can design a PDA that guesses the letter at position$i$that will be different in$w_1$and$w_2$. It start by reading the first$i$letters of$w_1$and pushing$i$symbols to the stack up to this letter, and remembers the letter$a$in$w_1$at position$i$. Then, it waits to read$c$, and pops the$i$symbols off the stack. It accepts if the current letter (at position$i$in$w_2$) is not$a$. You have to also treat the case of different lengths, but it's not hard, I leave it as exercise ;) ###### Best Answer from StackOverflow Question Source : http://cs.stackexchange.com/questions/18774 3.2K people like this Problem Detail: In computer vision, scales are important when we carry out a scene analysis. Choosing different scales affect the result of the analysis. For example, if a face is relatively small in the scene, then the details including nose, eyes will be omitted. On the other hand, details on larger faces become relatively more salient. I know both Gaussian Blur with different sigmas and Down Sampling on the image can generate different scales. Which is more reasonable on a cognitive sense? ###### Asked By : Strin #### Answered By : Josh Vander Hook Down sampling may discard relevant features, while blurring should not. As a toy example, a down sample may remove a pixel which is a local maxima, while a blur operation will preserve the maxima by increasing the values of nearby pixels. If the local maxima corresponds to an interesting feature, it may still be discernible by the human eye after blurring. From a computational sense, Laplacian pyramids are able to reconstruct an image precisely because a blur-subtract operation preserves the "information" in the scene. ###### Best Answer from StackOverflow Question Source : http://cs.stackexchange.com/questions/3310 3.2K people like this Problem Detail: Why are blocking artifacts serious when there is fast motion in MPEG? Here is the guess I made: In MPEG, each block in an encoding frame is matched with a block in the reference frame. If the difference of two blocks is small, only the difference is encoded using DCT. Is the reason blocking artifacts are serious that the difference of two blocks is too large and DCT cut the AC component? ###### Asked By : Bear #### Answered By : Danny Varod DCT is both discrete and finite, it has a limited range in both the space and frequency domains. If the change in a certain location is large or the location changes a lot, the results of the transform will be beyond the sampling range in the frequency domain. Since as you mentioned MPEG is based on blocks, the result of a block with with missing high frequencies, after transform back to the spacial domain, will result in the loss of detail in that block - making the block look flat. For a more accurate result try Signal Processing Stack Exchange. ###### Best Answer from StackOverflow Question Source : http://cs.stackexchange.com/questions/1413 3.2K people like this Problem Detail: An imaginary processor has the following hardware specification: • 8bit data bus • 12bit address bus • 32 × 8bit general purpose registers e.g. S0S1F Briefly describe what bit fields are required within an instruction to encode the following functionality: 1. 56 different instructions 2. Register addressing e.g. ADD S0, S1, add the contents of register S1 to register S0, store the result in register S0. 3. Immediate addressing e.g. ADDI S0, 10, add the constant 10 (base 16) to register S0, store the result in register S0. 4. Absolute addressing e.g. ADDA S0, 100, add the data stored in external memory address 100 (base 16) to register S0, store the result in register S0 5. If the processor uses a fixed length instruction format, briefly describe how many bits are required to represent an instruction and the bit fields used. For (1), I know it's$\log_2 56$round up to 6 bits but for (2), I know the answer is 6bits+5bits+5bits, but I can't figure out why. ###### Asked By : tmvnty #### Answered By : Ran G. An "add" instruction (e.g., ADD S0 S1B) must have 3 parts of information: 1. which instruction to do ("ADD") 2. Which is the input register (S0) 3. Which is the output register (S1B) How many bits does each part take? Well, you correctly answered that the first part is 6 bit. Can you see why the second and third parts take 5 bits each? For the other parts of the question, try to split to the information the instruction must have, and analyze the amount of bits each part takes, under the definitions of the specific machine in use. ###### Best Answer from StackOverflow Question Source : http://cs.stackexchange.com/questions/26277 3.2K people like this Problem Detail: Can someone help me to store the factorial of large numbers such as 100! efficiently? UPDATE: obviously, storing the argument rather than the factorial digits themselves achieves a significant saving. The true challenge is to find a data compression scheme that achieves a significant compression ratio, but with a lighter computational complexity than recomputing the factorial(s) from the argument. More precisely, can you design an algorithm to produce all decimal digits of$n!$, for$n\le N$, using$o(N\log N!)$storage and small computational cost, such as the lower bound$O(\log n!)$. ###### Asked By : LUSer #### Answered By : Yuval Filmus We can calculate 100! = 93326215443944152681699238856266700490715968264381621468592963895217599993229915608941463976156518286253697920827223758251185210916864000000000000000000000000. You can store this in any way you want: as 100!, as the ASCII string 93326215443944152681699238856266700490715968264381621468592963895217599993229915608941463976156518286253697920827223758251185210916864000000000000000000000000, as a Unicode string, or in binary in any number of formats. It all depends on what you want to do with the value. If you want to display the value on screen as a decimal, you might as well store the decimal string. If you want to do arithmetic, then you should be using a library for dealing with "big numbers" such as GMP, and in this case the library will provide its own means for storing numbers. If you're just interested in storing the expression 100!, and intend to use it in a computer algebra software (CAS), you might as well store it as just 100!. ###### Best Answer from StackOverflow Question Source : http://cs.stackexchange.com/questions/26483 3.2K people like this Problem Detail: I have a contiguous ordered data structure (0 based index): x= [1/3, 1/3, 1/3]  Let's say I selected index 1 and increased the probability by 1/3. Rest of the probabilities each decrease by 1/6 and the total probability remains P = 1. x= [1/6, 2/3, 1/6]  Let's say I selected index 2 and increased the probability by 1/3. Rest of the probabilities in total need to decrease by 1/3 to make the total probability remain P= 1. x= [1/10, 2/5, 1/2]  Is there a name for this kind of data structure? I'd like to research that name and use a library instead of my custom rolled code if possible. ###### Asked By : pwned #### Answered By : A.Schulz This can be easily achieved by an array$A$and an additional variable$m$. The entry$A[i]$stores the probability of$i$times$m$. In your example you start with A=[1,1,1], m=3  The you increase the probability of the second element by 1/3. This can be carried out by setting$A[2]=4$and$m=6$, which gives A=[1,4,1], m=6  In general you could set the probability of element$k$to$p$by solving the system $$\sum_i A[i] = m \text{ and } A[k]=p\cdot m,$$ for the unknowns$A[k]$and$m$. So all you need to do is to set$m$to $$m = \frac{\sum_{i\neq k} A[i]}{1-p},$$ and$A[k]=p\cdot m$for the new$m$. ###### Best Answer from StackOverflow Question Source : http://cs.stackexchange.com/questions/6741 3.2K people like this Problem Detail: I am reading Algorithms 4th edition by Robert Sedgewick and am stumped at a particular problem. On page 460 of the book the author is describing a technique to hash strings and use prime numbers for modular hashing. Starting from the fourth line:  If R is greater than any character value, this computation would be equivalent to treating the String as an N-digit base-R integer, computing the remainder that results when dividing that number by M.  Where the algorithm being used to hash a N-character long String is: int hash = 0; for (int i = 0; i < s.length(); i++) hash = (R * hash + s.charAt(i)) % M;  How does the above algorithm result in a base-R integer? Here M and R are integers (random) and s is the string of length N. I am assuming that by the 'above computation' the author means the computation within the parentheses => R*hash + s.charAt(i) ###### Asked By : Deepak #### Answered By : Deepak The iteration results in : (R^2*C + R^1*B + R^0*A + some constant)mod(M) for the string ABC, the value inside the brackets is a base 16 representation of CBA plus some constant value. ###### Best Answer from StackOverflow Question Source : http://cs.stackexchange.com/questions/37194 3.2K people like this Problem Detail: Usual parallel algorithms require a number of processors that grow with the size of the input, usually$O(\log n)$or$O(n)$. This is great from a theoric point of view, but strange from a practical point of view, where the number of the processors that we can use is bounded by a constant. A corollary of the Brent Theorem allows us to simulate a parallel machine with less processors, but doesn't gurantee that the algorithm is still optimal. Is there a branch of parallel algorithms that studies parallel algorithms that run on a number of processors bounded by a constant? As commented below: If the number of processors is bounded by a constant, then any sequential algorithm will be asymptotically optimal. I think that this fact implies that asyntotical optimality is not the best way to measure performances in real world parallel algorithms. Is there another measure, like speedup or cost (defined as execution time multiplied by the number of processors) that can tell us how faster is an algorithm that runs on a constant number of processors? A related question is: How to scale down parallel complexity results to constantly many cores? ###### Asked By : Vitalij Zadneprovskij #### Answered By : Wandering Logic There are several models, widely used in practice, that might be of interest to you. Almost all of them could be viewed as generalizations of Amdahl's law. They take two parameters,$P$(number of processing elements) and$N$size of the input. Several conferences in which you are likely to find these types of analyses are the ACM Symposium on Parallel Algorithms and Architectures (SPAA), and the ACM SIGPLAN Symposium on the Principles and Practice of Paralllel Proogramming (PPoPP). The goal with these models is generally to understand the tradeoff, for a specific algorithm, between the increased parallelism available from larger$P$with the increased synchronization and communication costs that come along with larger$P$. The first one that got major attention was Valiant, Leslie G: A bridging model for parallel computation, Communications of the ACM, 33(8):103-111, 1990. Valiant assumes that your algorithm moves between phases of completely parallel work, a phase of barrier synchronization during which the processors communicate, and then the next phase of completely parallel work. For each computation phase you assume it is going to take time inversely proportional to$P$. For the synchronization phases you usually want to take into account load imbalance, and the latency of the communication phases is going to depend both on the structure of your communication network and how good a job you can do at minimizing the distance between communicating processors. The paper Culler, D; Karp, R; Patterson, D; Sahay, A; Schauser, KE, Santos, E; Subramonian, R; von Eicken, T:$LogP$: Towards a Realistic Model of Parallel Computation, ACM Symp Principles and Practice of Parallel Programming, (PPOPP-4):1-12, 1993. built on Valiant's model by carefully taking into account the$L$atency between nodes, the$o$verhead of sending a message, and the$g$ap required between send operations (a measure of bandwidth), in addition to$P$. This is too detailed for some uses, but can be useful in cases where it is important for the algorithm designer to choose between many small messages and fewer long messages. Finally I'll mention Blumofe, Robert D; Leiserson, Charles E: Scheduling Multithreaded Computations by Work Stealing, IEEE FOCS, 1994. which gives a small generalization of Amdahl's law, pointing out that execution time is bounded (below) by$T_P = O(T_1/P + T_\infty)$, where$T_1$is the minimum serial execution time and$T_\infty$would be the minimum execution time with an infinite (i.e., arbitrarily large) number of processors and no communication delays. (Another way to look at$T_\infty$is that it is the critical path through the algorithm.) ###### Best Answer from StackOverflow Question Source : http://cs.stackexchange.com/questions/39899 3.2K people like this Problem Detail: My semidefinite program amounts to two constraints,$L_1 = 0$and$L_2 = 0$where$L_i$are linear functions of my variables$x_{ij}$with the additional constraint that the$x_{ij}$matrix is positive semidefinite. I see no way that this program would be infeasible, because just setting every variable to 0 would satisfy all the constraints. I have written the semidefinite program according to SPDA format. In this format, my SDP is the dual program. When I solve it with the software csdp, it tells me that the "dual program is infeasible." Here is the particular SDP: 2 1 11 0.0 0.0 0 1 1 10 1.0 1 1 1 10 .25 1 1 3 10 .25 1 1 6 10 -.25 1 1 8 10 -.25 1 1 9 10 -.5 2 1 2 11 -3.0 2 1 3 11 -4.0 2 1 4 11 1.0 2 1 5 11 1.0 2 1 6 11 -4.0 2 1 7 11 3.0 2 1 9 11 1.0  csdp outputs This is a pure dual feasibility problem. Iter: 0 Ap: 0.00e+000 Pobj: 0.0000000e+000 Ad: 0.00e+000 Dobj: 0.0000000e+000 Iter: 1 Ap: 1.00e+000 Pobj: 5.6521881e+000 Ad: 9.00e-001 Dobj: 0.0000000e+000 Declaring dual infeasibility. Success: SDP is dual infeasible Certificate of dual infeasibility: tr(CX)=1.00000e+000, ||A(X)||=1.38778e-017  ###### Asked By : Mark #### Answered By : Brian Borchers The original poster seems to misunderstand the distinction between "primal infeasible" and "dual infeasible." The output from CSDP shows that the problem is primal feasible ($X=0$is a primal feasible solution) but that the primal problem is unbounded. By weak duality, the corresponding dual problem is infeasible. If you just want a primal feasible solution and don't care about the primal objective value, then you can get this in a couple of ways: 1. You can use the "certificate" returned by CSDP. This is a matrix$X$such that$X$is positive semidefinite and$A(X)=0$. Any positive multiple of this matrix is a primal feasible solution to your SDP. 2. You can add an additional constraint that causes the objective function to be bounded. A simple choice would be trace(X)=100. By the way, it's easy to confirm this using another SDP solver such as SeDuMi or SDPT3 or SDPA. ###### Best Answer from StackOverflow Question Source : http://cs.stackexchange.com/questions/33478 3.2K people like this Problem Detail: i prepare for Autotmata Course Final Exam. in one of lecture, our professor draw this Turing Machine, and wrote DELTA is Neutral element of TM. it'w wrote: Language of this TM is: {$W \in (a+b)^* : w=w^R$} but i think x is in alphabet and the language is {$W \in (a+b+x)^* : n_a(w)=n_b(w)$}$n_a$is number of a's in w. anyone could add some detail or hint for this note and which of them is correct? ###### Asked By : Mina Simin #### Answered By : Hendrik Jan Look at the instructions like$a/x,R$in the diagram. They are used to "mark" the letters that have been compared (for every$a$marked, we also mark a$b$). The TM repeatedly marks a pair of different symbols, and then restarts at the left side of the tape. The machine works as follows. It starts at the leftmost symbol of a tape. It skips all$x$'s (in$q0$moving right) and looks for the first$a$or$b$, moving to$q1$and$q2$respectively (to distinguish cases). In that state it looks for the other symbol skipping$x$'s, and$a$'s or$b$'s depending on the letter we found first. At that point (in$q3$) it returns to the left of the tape (moving left until a blank$\Delta$is found) and repeats. Acceptance if in$q0$we only find$x$'s and reach the blank$\Delta$to the right of the tape. This indicates that$x$is a special symbol from the tape alphabet, but not the input alphabet, and is used for administration during the computation. The original language is correct:$\{ w\in \{a,b\}^* \mid n_a(x) = n_b(x) \}$. Technically it could be the case that$x$is also in the input alphabet. However what your professor stated is consistent with the diagram. A language over$\{a,b\}$, and an extra tape symbol$x$to mark symbols that have been accounted for. ###### Best Answer from StackOverflow Question Source : http://cs.stackexchange.com/questions/29315 3.2K people like this Problem Detail: Given pseudocode: X is an array function FUN(X) l = length(X) if l <= 1 then PRINT("Hello") return end if p = l / 3 FUN(X[1...p]) FUN(X[(l-p+1)...l]) end function  I now want to get the reccurence of this pseudocode. At the moment I have $$T(n) = 2 \cdot T(\frac{n}{b})\cdot O(1)$$ but the problem is that I don't know how to handle the fact that the first recursive call has a subproblem size of$\frac{1}{3}$and the second has a subproblem size of$\frac{2}{3}$. What is$\frac{n}{b}$then? ###### Asked By : Basti Funck #### Answered By : Rick Decker Your timing function is$T(n)=2T(n/3)+1$, namely two recursive calls to one-third size problems plus a constant amount of time for the if test and the assignments. If you know the Master Theorem, the answer is immediate, namely$T(n)=\Theta(n^{\log_32}). If you don't know the MT, it's not too difficult to see the answer by expansion: \begin{align} T(n) &= 2T\left(\frac{n}{3}\right)+1\\ &= 2\left(2T(\frac{n}{3^2})+1\right)+1=2^2\ T\left(\frac{n}{3^2}\right)+2+1\\ &=2^3\ T\left(\frac{n}{3^3}\right)+4+2+1 &\text{and, in general, we have}\\ &\dotsc\\ &=2^k\ T\left(\frac{n}{3^k}\right)+\left(2^{k-1}+\dotsm +4+2+1\right) \end{align} Now letn=3^k$and the result follows with a little algebra. ###### Best Answer from StackOverflow Question Source : http://cs.stackexchange.com/questions/40234 3.2K people like this Problem Detail: Suppose we have an algebraic specification in the form:$\{S,F,w\}$where$S$are the sorts,$F$are the functions and$w$are the set of equations. For example, the specification for natural numbers: •$S = \{\mathrm{int}\}$•$F = \{ 0: \mathrm{int}, \; \mathrm{succ} : \mathrm{int}\rightarrow\mathrm{int}, \; \mathrm{pred}: \mathrm{int}\rightarrow\mathrm{int} \}$•$w = \{ \mathrm{succ}(\mathrm{pred}(x)) = x, \; \mathrm{pred}(\mathrm{succ}(x)) = x \}$My question is, why and where do we need homomorphisms and isomorphisms in this case? How do homomorphisms and isomorphisms look like between algebras ? ###### Asked By : M.M #### Answered By : Pål GD Can you come up with two different algebras, say, one where the domain is$\mathbb{N}$and one where the domain is$\{0,1\}$and in the former, suc and pred work as you would assume, and in the latter, they are modulo 2 operations? Then, try to come up with a homomorphism from one to another. Then, try to make an algebra, where the domain is$\{0, s0, ss0, sss0, \dots\}$and define suc and pred as you would guess. Make an isomorphism from this one to the one with$\mathbb{N}$as domain. Finally, you can make the "term algebra", which has as domain all strings that are valid "terms", i.e.: "0" is in the domain, and for every element$t$in the domain, "suc($t$)" is in the domain, and "pred($t$)" is in the domain. Their interpretation is simply that the constant 0 maps to the string "0". The term pred(suc(suc(suc(0)))) maps to the string "pred(suc(suc(suc(0))))". Now, you might have a hard time making an isomorphism from this to the standard algebra (the one with$\mathbb{N}$), since "0" and "pred(suc(0))" should both map to 0. I'm not sure exactly what you ask for, but these two tasks should get you started, at least. ###### Best Answer from StackOverflow Question Source : http://cs.stackexchange.com/questions/9005 3.2K people like this Problem Detail: It's known that for$f(n) \geq \log n$,$\mathsf{NSPACE}(f(n)) = \mathsf{coNSPACE}(f(n))$. What if$f(n)<\log n$? Are they also equal? ###### Asked By : URL87 #### Answered By : Reza Immerman–Szelepcsényi theorem works under logarithmic reductions. Sublogarithmic space classes have different reductions. The TMs working within sublogarithmic space cannot even record the the length of the input. It has been shown that the space hierarchy for any sublogarithmic bound in Ω(log log n)-- o(log n) is inﬁnite. You can find it in following references: V. Geffert. Sublogarithmic σ2-space is not closed under complement and other separation results. Theoretical Informatics and Applications, 27:349–366, 1993. M. Liśkiewicz and R. Reischuk. Separating the lower levels of the sublogarithmic space hierarchy. In Proceedings of the Symposium on Theoretical Aspects of Computer Science, pages 6–27, 1993. B. von Braunmuhl "Alternation for two-way machines with sublogarithmic space". In Proceedings of the Symposium on Theoretical Aspects of Computer Science, pages 5–15, 1993. The paper The complexity world below logarithmic space by M. Liśkiewicz and R. Reischuk contains an excellent wrap up of the sublogarithmic space. ###### Best Answer from StackOverflow Question Source : http://cs.stackexchange.com/questions/9625 3.2K people like this Problem Detail: $$a_i * x_i + b_i * x_j = N$$$\text{given } a_i, b_i, N \in \mathbb{N}$i want to find a solution where$x_i, x_j \in \mathbb{N}$or output if there exists none. What is the fastest way to calculate this? ###### Asked By : user3613886 #### Answered By : Tom van der Zanden We can write$N$as a linear combination of$a$and$b$if and only if$N$is a multiple of$d=\gcd(a,b)$. Using the extended Euclidean algorithm we can find integers$x,y$such that$ax+by=d$and this in turn yields a solution of the form$\frac{N}{d}(ax+by)=N$. Now$\frac{Nx}{d}$and$\frac{Ny}{d}$will not necessarily be positive. However, for all integer$k$,$a\frac{Nx+bk}{d}+b\frac{Ny-ak}{d}=N$will also be a solution. To determine whether the equation has any natural solutions, we must simply determine the range of$k$for which$Nx+bk$is positive and the range for which$Nx-ak$is positive and determine whether the ranges overlap. In the case with 3 unknowns,$ax_1+bx_2+cx_3=N$, note that$bx_2+cx_3$will always be a multiple of$\gcd(b,c)$. The problem thus comes down to solving$ax+y\gcd(b,c)=N$, picking an appropriate solution to that problem, and then solving$bx_2+cx_3=y\gcd(b,c)$. However it is not clear to me how to pick$y$so that the second problem can be solved. ###### Best Answer from StackOverflow Question Source : http://cs.stackexchange.com/questions/37254 3.2K people like this Problem Detail: Let L = {ab,aa,baa}.  I need to find L^4. From my understanding, I union the set. So: L^1 = {ab,aa,baa} L^2 = {abab,abaa,abbaa,aaab,aaaa,aabaa,baaab,baaaa,baabaa} L^3 = {abababab,abababaa,abababba,....} ?? ` I concatenate L^3 with L^2 as I did above correct? And then I concatenate L^4 with L^3? ###### Asked By : Question_Guy #### Answered By : Yuval Filmus The set$L^4$consists of all words$xyzw$where$x,y,z,w \in L$. In our case, there are (at most)$3^4 = 81$such words, but I don't see why you'd want to write all of them. If you want, you can also compute$L^4$by squaring$L$twice:$L^4 = (L^2)^2$. Then there is no need to compute$L^3$. ###### Best Answer from StackOverflow Question Source : http://cs.stackexchange.com/questions/30462 3.2K people like this Problem Detail: So I've been doing some reading on community detection in graphs as I'm planning on working on my thesis for it. I've been reviewing papers regarding the same and came across the Girvan-Newman algorithm. I've read the paper and have a doubt which I couldn't really figure out. The algorithm works by removing the edge which has the highest value of "edge betweenness" in every iteration. Suppose I have a graph$G(V, E)$such that$V = \{ v_{1}, v_2, \cdots , v_n\}$. Now suppose this graph is such that it has 2 distinct communities (something we already know but that's what the algorithm should detect). After the first iteration, it would remove the edge which has the highest value of edge betweenness. Now my question is, after this is done (or even after a few iterations), say I have any vertex$v_i$, how do we know which community this vertex belong to? Am I missing something here? ###### Asked By : muddy #### Answered By : Sasho Nikolov This is hierarchical clustering method. As you remove edges, at some point the graph will become disconnected: the connected components are your top-level communities. As you keep removing edges, the top-level communities become disconnected too, and you can think of the new smaller connected components as sub-communities. So you can think of the algorithm as building a tree of finer and finer communities, where the leaves are individual vertices. ###### Best Answer from StackOverflow Question Source : http://cs.stackexchange.com/questions/10411 3.2K people like this Problem Detail: What is a standard term for a matching in a bipartite graph, in which one part has less vertices than the other part, and the part with less vertices is fully matched (but the other part is, obviously, not fully matched)? It is not exactly perfect, but it is "one-sided perfect", or "as perfect as it could be". What is a short term to use for this kind of matching? ###### Asked By : Erel Segal-Halevi #### Answered By : Pål GD Such a matching is said to saturate one of the sides. More specifically, if$M$is a matching of a graph$G$and$v$is incident to one of the edges of$M$, we say that$M$saturates$v$. If$A$is a set of vertices, then we say that$M$saturates$A$if every vertex in$A$is saturated by$M$. Specifically for your case, the matching saturates the smaller of the two sides of the bipartition. ###### Best Answer from StackOverflow Question Source : http://cs.stackexchange.com/questions/37837 3.2K people like this Problem Detail: Is it Ok to express an NFA accepting 1*0 like the one above? All examples in books are represented like the bottom one. I'd like to know is that only a convention or it must have two extra nodes? ###### Asked By : pedim #### Answered By : Renato Sanhueza It is not a convention. You must know that a regular expression denotes one regular language and this language can be generated by an infinite number of differeny automatona (in this case we are talking about NFA). Why are you always encountering the second automaton in the books? This is probably because the second automaton is obtained from the regular expression$1^*0\$ using a well known method called Thompson's construction.

Which of the two automata is better? There isn't just one answer. The first NFA looks simpler but maybe you will have more work when you try to formally prove its correctness. In the other hand the second automaton looks more complex but it was obtained by a proven method so we are sure that it is correct.

Question Source : http://cs.stackexchange.com/questions/53115

3.2K people like this
Problem Detail:

What is an intuitive way of understanding what a push down automaton is capable of computing?