Yes, it is possible, but it requires additional memory allocation and is not recommended.
7) What is table partitioning?
Table partitioning is a way to divide a large table into smaller, more manageable parts without having to create separate tables for each part. Data in a partitioned table is physically stored in groups of rows called partitions and each partition can be accessed and maintained separately.
Why partition?
Primarily for performance. Because the table is split up, data can be retrieved faster and in a more organized way when the query optimizer can target a partition: for example, a WHERE clause filtering on the partitioning column (say, a BIT column = 1) searches only the matching partition. This cuts back seeks and scans dramatically. Partitioning is most useful for tables with an extremely high number of rows.
How?
Right-click the table -> Storage -> Create Partition.
choose the column to partition.
create partition function.
create partition scheme.
Decide between a Left or Right range boundary. With RANGE LEFT, each boundary value belongs to the partition on its left (rows are stored with a less-than-or-equal-to condition); with RANGE RIGHT, the boundary value belongs to the partition on its right (the lower partition holds strictly-less-than values).
Choose a filegroup to partition to. Ensure you have more than one filegroup: one for the records matching the boundary condition and another for the remaining records.
Finally, choose to script out your selections as T-SQL, run the partitioning immediately, or schedule it for later.
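The same steps the wizard performs can be scripted directly in T-SQL. A minimal sketch, assuming a date-partitioned orders table; all object, filegroup and column names here are hypothetical and the filegroups must already exist in the database:

```sql
-- Partition function: RANGE RIGHT means each boundary value belongs to the partition on its right
CREATE PARTITION FUNCTION pfOrderDate (DATE)
    AS RANGE RIGHT FOR VALUES ('2018-01-01', '2019-01-01');

-- Partition scheme: maps the three resulting partitions to filegroups
CREATE PARTITION SCHEME psOrderDate
    AS PARTITION pfOrderDate TO (fgArchive, fgCurrent, fgFuture);

-- Table created on the scheme, partitioned by OrderDate
CREATE TABLE dbo.Orders
(
    OrderId   INT  NOT NULL,
    OrderDate DATE NOT NULL
) ON psOrderDate (OrderDate);
```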
8) Why Backup and restore?
Backup - protects databases against data loss caused by a variety of failures such as media failure, user error, hardware failure, natural disasters, etc. Test your strategy by restoring a set of backups and then recovering the database, so that you are prepared to respond effectively to a disaster.
Restore-A multi-phase process that copies all the data and log pages from a specified SQL Server backup to a specified database, and then rolls forward all the transactions that are logged in the backup by applying logged changes to bring the data forward in time.
Advantages: maximizes data availability and minimizes data loss.
We can create and restore backups using SSMS and T-SQL.
The RPO (Recovery-Point-Objective) and RTO (Recovery-Time-Objective) are vital parameters to consider when planning the backup strategy.
The recovery point objective (RPO) is the point in time to which data can be restored. The recovery time objective (RTO) is the amount of time the restoration takes.
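A minimal T-SQL sketch of the backup-and-restore cycle described above; the database name and file path are hypothetical:

```sql
-- Full backup to disk; INIT overwrites any existing backup set in the file,
-- CHECKSUM verifies page integrity while writing
BACKUP DATABASE SalesDb
TO DISK = N'D:\Backups\SalesDb.bak'
WITH INIT, CHECKSUM;

-- Restore from the same file; REPLACE overwrites the existing database,
-- RECOVERY rolls forward logged transactions and brings the database online
RESTORE DATABASE SalesDb
FROM DISK = N'D:\Backups\SalesDb.bak'
WITH REPLACE, RECOVERY;
```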
9)Stored Procedures and functions
Functions: a function allows only SELECT statements; it does not allow DML statements (INSERT/UPDATE/DELETE on permanent tables). Functions accept only input parameters and do not support output parameters. Transactions are not allowed within functions. Stored procedures cannot be called from a function, but a function can be called from a SELECT statement. User-defined functions can be used in a JOIN clause as a result set. They are useful with specialized data types such as images and geometric objects.
Stored procedures can contain SELECT statements as well as DML statements such as INSERT, UPDATE and DELETE. They support both input and output parameters. Transactions can be used in stored procedures. Table variables and temporary tables can be used. Stored procedures can call functions, but procedures cannot be called from SELECT/WHERE/HAVING clauses; the EXECUTE/EXEC statement is used to call a stored procedure, and a procedure cannot be used in a JOIN clause. A stored procedure is prepared SQL code that you save so you can reuse it repeatedly. Instead of running the same SQL code again and again, you can pass parameters to the stored procedure, so it can act accordingly based on the parameter values that were passed.
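A short sketch contrasting the two; the table, object and column names are hypothetical:

```sql
-- Scalar function: computation only, input parameters only, no DML
CREATE FUNCTION dbo.fnGetBonus (@Salary MONEY)
RETURNS MONEY
AS
BEGIN
    RETURN @Salary * 0.10;
END;
GO
-- Stored procedure: DML allowed, supports an OUTPUT parameter
CREATE PROCEDURE dbo.uspRaiseSalary
    @CustomerId INT,
    @NewSalary  MONEY OUTPUT
AS
BEGIN
    UPDATE dbo.Customer SET Salary = Salary * 1.05 WHERE Id = @CustomerId;
    SELECT @NewSalary = Salary FROM dbo.Customer WHERE Id = @CustomerId;
END;
GO
-- A function can be used inline in a SELECT; a procedure needs EXEC:
-- SELECT Id, dbo.fnGetBonus(Salary) FROM dbo.Customer;
-- EXEC dbo.uspRaiseSalary @CustomerId = 1, @NewSalary = @s OUTPUT;
```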
Usage of Stored Procedures in SSIS - preparing tables (truncate), audit tasks (usually part of SSIS framework), getting configuration values for loops
10) difference between parameters and variables in ssis
In simple words
value that you pass is a PARAMETER
value that you declare is a VARIABLE
Parameters - The value of a parameter can be set at development time, after the package is deployed to the SSIS server (via SSMS) or at run time (SSMS or SS agent job).
Project: available to all packages within the SSIS project (like global parameters). Project parameters are new with SSIS 2012. A project parameter can be shared among all of the packages in an SSIS project. You want to use a project parameter when, at run time, the value is the same for all packages.
All parameters can be specified in project.params.
Usage - easy to share a given value, such as the path to a file share or the name of a server, across all packages in a project.
Package: these parameters affect only the package on which they were defined. Package parameters are also new with SSIS 2012. A package parameter is exactly the same as a project parameter, except that its scope is the individual package it resides in. You want to use a package parameter when, at run time, the value is different for each package.
Parameters (project or package) do *not* have an expressions property to define their value. They are intended to hold a literal value which does not change while the package executes.
Value of the parameter remains the same for the entire execution of the package.
Variables - the value of a variable can be set in many ways within a package and can be referenced by other containers, expressions, tasks, components or event handlers. The value of a variable can change during the execution of the package, which makes variables suitable for dynamic scenarios.
Eg: used in conditional, looping tasks
SSIS->VARIABLES
Parameters do not support the datatypes char, DBnull, object.
Within a package, SSIS variables have an expression property. The expression property, and the ability to change values during the execution of a package if needed, are two fundamental differences variables have from parameters. A variable can be scoped to the package or an object within the package.Variables often consume values from parameters within their expressions.Variables can also communicate values to objects like Execute SQL tasks, For Each containers, Send Mail tasks, and so on.
https://www.sqlchick.com/entries/2013/9/15/getting-started-with-parameters-variables-configurations-in.html
11) difference between "fuzzy lookup" and "fuzzy grouping"
The Fuzzy Grouping task performs the same operations as the Fuzzy Lookup task but instead of evaluating input records against an outside reference table, the input set becomes the reference. Input records are therefore evaluated against other records in the input set and evaluated for similarity and assigned to a group.
The basic purpose of the Fuzzy Lookup is to match input data to a lookup table where the columns being matched do not necessarily match exactly. The Fuzzy Lookup transformation is therefore very similar to the Lookup transformation, except you are not joining on identical values; you are joining on similar values.
12) Difference between timestamp, smalldatetime, datetime and datetime2
- smalldatetime: 4 bytes of storage, YYYY-MM-DD hh:mm:ss, accuracy 1 minute
- datetime: 8 bytes of storage, YYYY-MM-DD hh:mm:ss.nnn, accuracy 0.0033 seconds
- datetime2: 6 to 8 bytes of storage, accuracy 100 nanoseconds; datetime2 is the new data type introduced in SQL Server 2008 for storing date and time values
WHY DATETIME2?
Use the time, date, datetime2 and datetimeoffset data types for new work. These types align with the SQL Standard and are more portable. time, datetime2 and datetimeoffset provide more seconds precision. datetimeoffset provides time zone support for globally deployed applications. [MSDN] datetime2 has a larger date range, a larger default fractional precision, and optional user-specified precision. Depending on the user-specified precision it may also use less storage. (Ref: Stack Overflow)
DATETIME2 HAS
- larger range of values
- better Accuracy (100 nanoseconds)
- smaller storage space (6 to 8 bytes)
- SELECT SYSDATETIME() returns the current date and time as the datetime2 data type, whereas SELECT GETDATE() returns it as the datetime data type.
SELECT CURRENT_TIMESTAMP
2019-05-09 05:49:06.090
SELECT SYSDATETIME() AS DATETIME2
2019-05-09 05:49:06.0905358
SELECT GETDATE() AS DATETIME
2019-05-09 05:49:06.090
TIMESTAMP (note: in SQL Server the timestamp data type is a synonym for rowversion and does not store a date or time; CURRENT_TIMESTAMP itself returns a datetime)
SELECT
CURRENT_TIMESTAMP AS current_date_time;
current_date_time
2019-05-08 15:30:57.543
Examples:
DECLARE @dt DATETIME
SELECT @dt = '2019-05-08 23:22:12'
SELECT @dt AS [DateTime],
CAST(@dt AS DATE) AS [Date]
DECLARE @Date DATETIME
SET @Date = GETDATE()
PRINT @Date
May 6 2019 8:22PM
SELECT GETDATE()
2019-05-06 20:46:34.327
SELECT SYSDATETIME()
2019-05-06 20:47:13.1734039
DECLARE @Date DATETIME2
SET @Date = GETDATE()
PRINT @Date
2019-05-06 20:25:15.4330000
To get time from todays date
SELECT CONVERT ( TIME , GETDATE())
AS 'Current TIME using GETDATE()'
Current TIME using GETDATE()
22:30:14.5170000
SELECT CONVERT (DATE, GETDATE()) 'Date Part Only'
Date Part Only
2019-05-09
We cannot add or subtract days with the + operator on datetime2, but we can with datetime. For datetime2, use the DATEADD function.
DECLARE @nowDateTime DATETIME = GETDATE()
SELECT @nowDateTime + 1
2019-05-07 20:49:43.287
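The datetime2 equivalent of the `+ 1` shorthand above uses DATEADD; a minimal sketch (the variable name is illustrative):

```sql
-- datetime2 does not support "+ 1"; DATEADD works for both datetime and datetime2
DECLARE @nowDateTime2 DATETIME2 = SYSDATETIME();
SELECT DATEADD(DAY, 1, @nowDateTime2) AS Tomorrow;
```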
datetime2 with a fractional seconds precision of 3 matches the precision of the datetime data type, but datetime2(3) uses 7 bytes of storage instead of the 8 bytes the old datetime data type uses, and it also provides a larger date range (0001-01-01 to 9999-12-31) compared to datetime.
SELECT CONVERT(DATE,GETDATE())
2019-05-08
sp_help 'table_name' will show all the details of a table, including its column data types.
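For example (the table name is hypothetical):

```sql
-- Returns column definitions, identity info, indexes and constraints for the table
EXEC sp_help 'dbo.Customer';
```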
13) How to handle SQL server errors?
It includes using common functions such as ERROR_NUMBER(), ERROR_MESSAGE() and ERROR_LINE() to return information about the error, and using the TRY...CATCH block in stored procedures and transactions.
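A minimal TRY...CATCH sketch around a transaction; the table and column names are hypothetical:

```sql
BEGIN TRY
    BEGIN TRANSACTION;
    UPDATE dbo.Accounts SET Balance = Balance - 100 WHERE AccountId = 1;
    COMMIT TRANSACTION;
END TRY
BEGIN CATCH
    -- Roll back if the transaction is still open, then report error details
    IF @@TRANCOUNT > 0 ROLLBACK TRANSACTION;
    SELECT ERROR_NUMBER()  AS ErrorNumber,
           ERROR_MESSAGE() AS ErrorMessage,
           ERROR_LINE()    AS ErrorLine;
END CATCH;
```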
14) Will Triggers affect performance of a table?
Yes. The database has more work to do, so performance is impacted, though the impact may be unnoticeable to end users. Triggers are mostly used for maintaining the integrity of the information in the database. Triggers affect the performance of INSERT, UPDATE and DELETE operations, but they do not affect read performance.
Compared with implementing the same logic in application code, triggers improve performance the most when they replace many SQL statements sent over a comparatively slow network, because the logic runs inside the server. Whenever the database server executes a SQL statement, it must also determine whether any triggers must be fired.
Magic tables in triggers
Usually, the inserted and deleted tables are called magic tables. These tables are automatically created and managed by SQL Server internally to hold the recently inserted, deleted and updated values during DML operations (INSERT, UPDATE and DELETE) on a table.
Inserted magic table
The inserted table holds the recently inserted values, in other words the new data values; recently added records appear in the inserted table.
Deleted magic table
The deleted table holds the recently deleted or updated values, in other words the old data values; the old versions of updated and deleted records appear in the deleted table.
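A sketch of an audit trigger that reads both pseudo-tables; the tables and columns are hypothetical:

```sql
-- On UPDATE, "deleted" holds the old row versions and "inserted" the new ones
CREATE TRIGGER trg_Employee_Audit
ON dbo.Employee
AFTER UPDATE
AS
BEGIN
    INSERT INTO dbo.EmployeeAudit (EmployeeId, OldSalary, NewSalary, ChangedAt)
    SELECT d.EmployeeId, d.Salary, i.Salary, GETDATE()
    FROM deleted d
    JOIN inserted i ON i.EmployeeId = d.EmployeeId;
END;
```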
15) maximum number of columns in a table?
Wide Tables are considered to be denormalized tables, Non-wide tables are considered to be Normalized tables. Wide table can contain 30,000 columns, Non-wide table(basic table) can contain only 1024 columns.
SQL Server supports a maximum of 1,024 Columns in a base table (non wide)
SQL Server supports a maximum of 30,000 Columns in a base table (wide)
Maximum number of triggers on a table-300
Clustered indexes per table-1
16) will insert work in functions?
No. A user-defined function cannot modify database state, so INSERT into a permanent table is not allowed (inserting into a local table variable inside the function is the only exception).
17) will insert work in procedures?
yes
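A minimal sketch of a procedure performing an INSERT; the table and column names are hypothetical:

```sql
CREATE PROCEDURE dbo.AddCustomer
    @Name   VARCHAR(100),
    @Salary MONEY
AS
BEGIN
    -- DML is permitted inside stored procedures
    INSERT INTO dbo.Customer (Name, Salary)
    VALUES (@Name, @Salary);
END;
GO
EXEC dbo.AddCustomer @Name = 'Alice', @Salary = 50000;
```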
18) Second Highest Salary of an employee
select MAX(Salary) from Customer where Salary < (Select max(Salary) from Customer)
SELECT TOP 1 Salary from
(SELECT DISTINCT TOP 2 Salary from Customer ORDER BY Salary desc) T
ORDER BY Salary ASC
To get all details,
SELECT TOP 1 * FROM (SELECT DISTINCT TOP 2 * FROM Customer ORDER BY Salary DESC) T
ORDER BY Salary ASC
4th highest salary of employee?
SELECT TOP 1 * FROM (SELECT DISTINCT TOP 4 * FROM Customer ORDER BY Salary DESC) T
ORDER BY Salary ASC
nth highest salary of employee
SELECT TOP 1 * FROM (SELECT DISTINCT TOP (n) * FROM Customer ORDER BY Salary DESC) T
ORDER BY Salary ASC
(replace n with a literal number, or use TOP (@n) with an INT variable)
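An alternative sketch for the nth highest salary using the DENSE_RANK window function, which handles duplicate salaries naturally; @n and the Customer table follow the examples above:

```sql
DECLARE @n INT = 4;  -- nth highest salary to find

SELECT Salary
FROM (
    -- DENSE_RANK gives ties the same rank with no gaps
    SELECT Salary, DENSE_RANK() OVER (ORDER BY Salary DESC) AS rnk
    FROM Customer
) ranked
WHERE rnk = @n;
```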
19) How many clustered index in one table?
1. But we can use more than one column to make a composite (multi-column) unique clustered index.
How many non clustered index in one table?
999 - SQL Server 2008
249 - SQL Server 2005
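A quick sketch of both index kinds, including a composite clustered index as mentioned above; the table and column names are hypothetical:

```sql
-- Only one clustered index per table: it defines the physical row order
CREATE UNIQUE CLUSTERED INDEX IX_Customer_Id
    ON dbo.Customer (CustomerId);

-- Many nonclustered indexes are allowed (up to the version-specific limit)
CREATE NONCLUSTERED INDEX IX_Customer_Salary
    ON dbo.Customer (Salary);

-- A composite clustered index would use more than one column:
-- CREATE UNIQUE CLUSTERED INDEX IX_Orders ON dbo.Orders (OrderId, OrderDate);
```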
20) Can we compare datetime and timestamp columns?
No. The timestamp data type is a synonym for rowversion: it stores an automatically generated binary number, not a date or time, so it cannot meaningfully be compared with a datetime column.
21) Transformations in SSIS
DATACONVERSION: Converts columns data types from one to another type. It stands for Explicit Column Conversion.
DATAMININGQUERY: Used to perform data mining query against analysis services and manage Predictions Graphs and Controls.
DERIVEDCOLUMN: Create a new (computed) column from given expressions.
EXPORTCOLUMN: Used to export an Image specific column from the database to a flat file.
FUZZYGROUPING: Used for data cleansing by finding rows that are likely duplicates.
FUZZYLOOKUP: Used for Pattern Matching and Ranking based on fuzzy logic.
AGGREGATE: It applies aggregate functions to Record Sets to produce new output records from aggregated values.
AUDIT: Adds Package and Task level Metadata: such as Machine Name, Execution Instance, Package Name, Package ID, etc.
CHARACTERMAP: Performs SQL Server column level string operations such as changing data from lower case to upper case.
MULTICAST: Sends a copy of the supplied data source to multiple destinations. This transformation sends output to multiple output paths with no conditions, unlike Conditional Split. It takes ONE input, makes copies of the data, and passes the same data through many outputs. In simple terms: give one input, take many outputs of the same data.
CONDITIONALSPLIT: Separates the available input into separate output pipelines based on Boolean expressions configured for each output. It functions like an if...then...else construct, sending each input row to the branch whose condition it satisfies. For example, you might want to split employees by salary: less than 50000 versus greater than or equal to 50000. You can give each condition a name that easily identifies its purpose; rows matching no condition are covered by the Default Output. After you configure the component and connect it to a subsequent transformation or destination, a dialog box pops up letting you choose which conditional output applies to that destination.
COPYCOLUMN: Add a copy of column to the output we can later transform the copy keeping the original for auditing.
IMPORTCOLUMN: Reads image specific column from database onto a flat file.
LOOKUP: Performs the lookup (searching) of a given reference object set to a data source. It is used for exact matches only.
MERGE: Merges two sorted data sets into a single data flow. The Merge transformation merges two paths into a single path. It is useful when you want to break data out into a path that handles errors and, after the errors are handled, merge it back into the downstream flow, or when you want to merge two data sources. It is similar to the Union All transformation, but the data must be sorted, and the data type, data length and other metadata attributes must match before merging.
MERGEJOIN: Merges two data sets into a single data set using a join condition. The Merge Join transformation merges the output of two inputs, performing an INNER or OUTER join on the data. But if the data comes from a single OLE DB data source, it is better to join via a SQL query rather than using Merge Join; Merge Join is intended for joining two different data sources.
ROWCOUNT: Stores the resulting row count from the data flow / transformation into a variable.
ROWSAMPLING: Captures sample data by using a row count of the total rows in dataflow specified by rows or percentage.
UNIONALL: Merge multiple data sets into a single dataset.
PIVOT: Converts rows into columns, denormalizing the data set.
UNPIVOT: Converts columns into rows, normalizing the data structure, for example when building data warehouses.
22) What are Precedence Constraints?
A task will only execute if the condition that is set by the precedence constraint preceding the task is met. By using these constraints, it will choose different execution paths depending on the success or failure of other tasks.
- Success – Workflow will proceed when the preceding container executes successfully. Indicated in control flow by a solid green line.
- Failure – Workflow will proceed when the preceding container’s execution results in a failure. Indicated in control flow by a solid red line.
- Completion – Workflow will proceed when the preceding container’s execution completes, regardless of success or failure. Indicated in control flow by a solid blue line.
- Expression/Constraint with logical AND – workflow will proceed when specified expression and constraints evaluate to true. Indicated in control flow by a solid color line along with a small ‘fx’ icon next to it. Color of line depends on logical constraint chosen (e.g. success = green, completion = blue).
23) What are checkpoints?
A checkpoint is a property in SSIS which enables a package to restart from the point of failure. When we set the property to true, the package creates a checkpoint file which stores information about the package execution and is used to restart the package from the point of failure. If the package runs successfully, the checkpoint file is deleted and then re-created the next time the package runs. Checkpoints do not save the state of ForEach and For loops. They're like safeguards for the project, in case something doesn't go according to plan.
24) Explain what breakpoints are and how you would use them.
Breakpoints put pauses in your package. It’s a great tool for debugging a package because you can place a break point on a task and it will pause the package based on execution events.
One situation in which I have used breakpoints is when I have a looping container and want to see how my variables change with each loop. I would add a watch window to the package, type the variable name in, and set a breakpoint on the container to stop after each iteration of the loop.
25) How do you handle errors in ssis?
When a data flow component applies a transformation to column data, extracts data from sources, or loads data into destinations, errors can occur. Errors frequently occur because of unexpected data values.
Errors typically fall into one the following categories:
Data conversion errors: occur if a conversion results in the loss of significant digits, the loss of insignificant digits, or the truncation of strings. Data conversion errors also occur if the requested conversion is not supported.
Expression evaluation errors: occur if expressions that are evaluated at run time perform invalid operations or become syntactically incorrect because of missing or incorrect data values.
Lookup errors: occur if a lookup operation fails to locate a match in the lookup table.
Many data flow components support error outputs, which let you control how the component handles row-level errors in both incoming and outgoing data. You specify how the component behaves when truncation or an error occurs by setting options on individual columns in the input or output.
26) Mention how would you deploy an SSIS package on production?
To deploy an SSIS package we need to execute the manifest file and determine whether to deploy it to the file system or onto SQL Server. Alternatively, you can import the package from SSMS, from either SQL Server or the file system.
27)Differentiate between Union All and Merge operations.
Merge transformation can accept only two inputs whereas Union all can take more than two inputs.
Data has to be sorted before Merge Transformation whereas Union all doesn’t have any condition like that.
Data from two paths can be merged into a single path by Merge transformation. This transform is found useful when the data flows through a path that can handle certain errors and merge back it into the main data source.
Prior to the Merge transformation, the data should be sorted, which can be done using the Sort transformation. The metadata must be the same for all paths; for example, the customer id field cannot be of numeric type in one path and character type in another.
Union All transformation works like the Merge transformation, but here in this transformation sorted data is not required. Here the output from multiple sources is taken and then it is transformed and combined to form a single result set.
28) What are Lookup Caches?