Linear programming is widely used for optimization, and applications can be found in almost every industry operating under conflicting constraints. Here we work with a simple and quite common use case: a cost optimization problem. It can be formulated as a standard linear optimization problem in which the objective function minimizes the transportation cost, subject to supply and demand expressed as equality and inequality constraints.
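For orientation, a transportation problem of this kind is commonly written as below. This is a textbook sketch, not necessarily the exact formulation used in the article: the symbols (sources $i$ with supply $s_i$, destinations $j$ with demand $d_j$, unit cost $c_{ij}$, shipped quantity $x_{ij}$) are assumptions introduced here, as is the choice of which constraint is the inequality and which is the equality.

```latex
\begin{aligned}
\min_{x}\quad & \sum_{i=1}^{m}\sum_{j=1}^{n} c_{ij}\,x_{ij} \\
\text{subject to}\quad
& \sum_{j=1}^{n} x_{ij} \le s_i, && i = 1,\dots,m \quad \text{(supply)} \\
& \sum_{i=1}^{m} x_{ij} = d_j,   && j = 1,\dots,n \quad \text{(demand)} \\
& x_{ij} \ge 0,                  && \text{for all } i, j
\end{aligned}
```

The problem is feasible only if total supply is at least total demand, that is, $\sum_i s_i \ge \sum_j d_j$.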

Let us create some synthetic data. For easy understanding and computational ease, the relevant information is in tabular format as shown below:

Optimization methods are widely applied in industry across diverse fields such as machine learning, finance, aviation, and logistics, to name a few. Once we have zeroed in on the problem statement, the next step is to solve the problem with the best available options.

To simplify, the idea is to find the best available solution: one that is at least as good as any other possible solution. To quantify the problem and express it mathematically, we need to come up with an objective for solving it, which in mathematics is called the objective function. …

The profitability of stock market trading is directly related to the prediction of trading signals. Here, we will discuss some basic to advanced, popular technical analysis techniques for building trading signals. Our focus will be on signal generation and visualization. A long list of technical indicators is available, covering principal domains such as trend, momentum, volume, volatility, and support and resistance. We will cover a few of these here.

However, once a signal is generated and a strategy is defined, the next most important task is performance testing, which is outside the scope of this article.
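As a rough illustration of what a trend-based signal can look like, here is a minimal moving-average crossover sketch in plain Python. The indicator choice, the window lengths, the function names, and the price list are all assumptions made for this example; the article itself may use different indicators and a pandas-based workflow.

```python
def sma(prices, window):
    """Simple moving average; None until `window` observations exist."""
    out = []
    for i in range(len(prices)):
        if i + 1 < window:
            out.append(None)
        else:
            out.append(sum(prices[i + 1 - window:i + 1]) / window)
    return out


def crossover_signals(prices, fast=3, slow=5):
    """Return +1 where the fast SMA crosses above the slow SMA,
    -1 where it crosses below, and 0 elsewhere."""
    f, s = sma(prices, fast), sma(prices, slow)
    signals = [0] * len(prices)
    for i in range(1, len(prices)):
        if None in (f[i - 1], s[i - 1], f[i], s[i]):
            continue  # not enough history yet
        prev_diff = f[i - 1] - s[i - 1]
        curr_diff = f[i] - s[i]
        if prev_diff <= 0 < curr_diff:
            signals[i] = 1   # bullish crossover
        elif prev_diff >= 0 > curr_diff:
            signals[i] = -1  # bearish crossover
    return signals


# Made-up prices: a dip, a recovery, then another decline.
prices = [10, 9, 8, 7, 6, 7, 8, 9, 10, 9, 8, 7, 6]
signals = crossover_signals(prices)
print(signals)
```

With real data the windows would be much longer (for example 20 and 50 periods), and the resulting signals would still need the performance testing mentioned above before being traded.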

We will use free cryptocurrency data as shown below. …

Prediction and classification are important and of great interest because successful prediction of stock prices leads to attractive benefits. However, there is no universal set of rules; such prediction involves a series of highly complicated and difficult tasks.

Here, we will walk through a simple use case showing how a classification rule can be applied to obtain a trading strategy, and conclude by testing the strategy's performance with a simple script.

Let us load the data from Quandl.

`BC = BC.loc['2010-01-01':,]`

BC.sort_index(ascending=True, inplace=True)

BC.tail()

Backtesting is an important step: it provides the statistics needed to confirm that a trading strategy is effective. Key outputs include profit and loss, net profit and loss, invested capital, number of trades/orders, return, Sharpe ratio, etc. Here, we will discuss how to design a financial trading strategy using open source Python tools, and we will review the results of the backtest by going through some plots generated by pyfolio.
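To make a few of these statistics concrete, the sketch below computes total return, an annualized Sharpe ratio, and maximum drawdown from a list of per-period returns using only the standard library. It does not use pyfolio; the function name, the 252-periods-per-year convention, the zero risk-free rate, and the sample returns are assumptions for illustration.

```python
import math


def backtest_stats(returns, periods_per_year=252, risk_free_rate=0.0):
    """Summarize simple per-period returns (e.g. 0.01 for +1%)."""
    equity, peak, max_drawdown = 1.0, 1.0, 0.0
    for r in returns:
        equity *= 1 + r                      # compound the equity curve
        peak = max(peak, equity)             # running high-water mark
        max_drawdown = max(max_drawdown, 1 - equity / peak)

    # Excess return per period over the (annual) risk-free rate.
    excess = [r - risk_free_rate / periods_per_year for r in returns]
    mean = sum(excess) / len(excess)
    var = sum((x - mean) ** 2 for x in excess) / (len(excess) - 1)
    std = math.sqrt(var)
    sharpe = mean / std * math.sqrt(periods_per_year) if std > 0 else float("nan")

    return {
        "total_return": equity - 1,
        "sharpe": sharpe,
        "max_drawdown": max_drawdown,
    }


# Made-up daily strategy returns.
daily_returns = [0.01, -0.02, 0.015, 0.005, 0.01]
stats = backtest_stats(daily_returns)
print(stats)
```

Five observations are far too few for a meaningful Sharpe ratio; the numbers only show how the quantities are assembled.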

Let us load the data as shown below.

`dataset = web.DataReader('^IXIC', data_source = 'yahoo', start = '2000-01-01')`

dataset = dataset.sort_index(ascending=True)

# display

print(dataset.head()); print(dataset.shape)

# Plot the adjusted closing prices

dataset['Adj Close'].plot(grid=True, figsize=(10, 6))

plt.title('Nasdaq Composite close price')

plt.ylabel('price …

An error correction model (ECM) is important in time-series analysis for better understanding long-run dynamics. An ECM can be derived from an autoregressive distributed lag model as long as there is a cointegration relationship between the variables. In that context, each equation in the vector autoregressive (VAR) model is an autoregressive distributed lag model; therefore, the vector error correction model (VECM) can be considered a VAR model with cointegration constraints.

The VECM has cointegration relations built into the specification, so that it restricts the long-run behavior of the endogenous variables to converge to their cointegrating relationships while allowing for short-run adjustment dynamics. …
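In the usual textbook notation, which is assumed here rather than taken from the article, a VECM for a $k$-dimensional series $y_t$ with lag order $p$ and no deterministic terms can be written as:

```latex
\Delta y_t \;=\; \Pi\, y_{t-1} \;+\; \sum_{i=1}^{p-1} \Gamma_i\, \Delta y_{t-i} \;+\; \varepsilon_t,
\qquad \Pi = \alpha \beta^{\top}
```

Here $\beta^{\top} y_{t-1}$ collects the cointegrating (long-run) relationships, $\alpha$ holds the adjustment speeds toward them, and the $\Gamma_i$ matrices capture the short-run dynamics. The rank of $\Pi$ equals the number of cointegrating relationships; if it is zero, the model reduces to a VAR in first differences.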

Feature selection is a data pre-processing step used in conjunction with machine learning for classification or regression purposes. The main motivation for reducing the dimensionality of the data and keeping the number of features as low as possible is to reduce the training time and enhance the classification accuracy of the algorithms we use; moreover, reduced dimensions provide more robust generalization and a faster response on unseen data. Unlike feature extraction, feature selection does not alter the data.

There are three main groups of feature selection methods: (1) wrapper, (2) embedded, and (3) filter methods. Each group has its own pros and cons. We will not get into the details of these methods; here we will show how different techniques, including mutual information (MI, a filter method), can be applied to reduce the dimensionality and still retain 99% of the variance in the data. The advantage is that, with the reduced features, noise in the dataset can be removed, and the model can more easily identify the signal in the reduced, relevant dataset and learn from it. …
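As a small illustration of the filter idea only, the sketch below scores discrete features by their mutual information with a discrete target and ranks them. The function names and the toy data are assumptions for this example, and it does not cover the variance-retention step mentioned above; for continuous features a library estimator such as scikit-learn's `mutual_info_classif` would be the more practical choice.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Mutual information (in nats) between two discrete sequences."""
    n = len(xs)
    px, py = Counter(xs), Counter(ys)
    pxy = Counter(zip(xs, ys))
    mi = 0.0
    for (x, y), count in pxy.items():
        p_xy = count / n
        mi += p_xy * math.log(p_xy / ((px[x] / n) * (py[y] / n)))
    return mi


def rank_features(features, target):
    """Return (name, MI score) pairs sorted from most to least informative."""
    scores = {name: mutual_information(col, target)
              for name, col in features.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Toy data: one feature copies the target, the other is independent of it.
target = [0, 0, 1, 1, 0, 1, 0, 1]
features = {
    "informative": [0, 0, 1, 1, 0, 1, 0, 1],
    "noise":       [0, 1, 0, 1, 1, 0, 0, 1],
}
ranking = rank_features(features, target)
print(ranking)
```

A filter method would then keep the top-scoring features, either a fixed number of them or those above a chosen threshold.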

Vector autoregression (VAR) models of integrated time series (TS) are generally fitted to first differences. But differencing may eliminate valuable information about the relationships among the variables, which is where the vector error correction model (VECM) is applicable.

VAR involves multiple exogenous variables that are important for predicting the future state of an endogenous variable. Using Granger causality (GC), we can determine the importance of these variables; GC is only relevant for TS variables. We will use VAR to investigate GC here.

Our use case here is that we have price data for West Texas Intermediate, Brent crude oil, and Henry Hub spot, and we shall forecast the next 15 time steps of each. We shall use R to solve this. …

Loans, in terms of financial payouts, are an important aspect of the banking business. Loan applications are screened on certain inputs to validate eligibility for a loan. Our use case here is to automate the loan eligibility process in real time, based on the customer details obtained during the loan application. This will lead to improved service and customer satisfaction.

Let us load the available data to check the information it contains.

ARIMA stands for autoregressive integrated moving average and is popular for time-series prediction. We have univariate daily time-series data, and our use case here is to forecast future time steps from it. The time series is a stochastic (random walk) price series. Here, we will discuss basic time-series analysis, the concepts of stationary and non-stationary time series, and how we can model financial data displaying such behavior.

We will introduce and implement several mathematical tools for dealing with non-stationary time-series datasets: Autoregressive (AR) and Moving Average (MA) models, Differencing (D), the Autocorrelation Function (ACF), and the Partial Autocorrelation Function (PACF). We will also introduce the concept of seasonality in time series. …
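Two of these tools, differencing and the sample ACF, are simple enough to sketch in plain Python. The example below simulates a random walk (an assumption standing in for the article's price series, with a made-up seed and length), then compares the lag-1 autocorrelation before and after first differencing.

```python
import random


def difference(series, lag=1):
    """First (or lag-k) difference: x[t] - x[t - lag]."""
    return [series[i] - series[i - lag] for i in range(lag, len(series))]


def acf(series, max_lag):
    """Sample autocorrelation for lags 0..max_lag (series must not be constant)."""
    n = len(series)
    mean = sum(series) / n
    c0 = sum((x - mean) ** 2 for x in series)
    out = []
    for k in range(max_lag + 1):
        ck = sum((series[t] - mean) * (series[t - k] - mean) for t in range(k, n))
        out.append(ck / c0)
    return out


# Simulated random walk: each step adds Gaussian noise to the previous value.
random.seed(0)
walk = [0.0]
for _ in range(499):
    walk.append(walk[-1] + random.gauss(0, 1))

diffed = difference(walk)
print("lag-1 ACF, random walk:      ", round(acf(walk, 1)[1], 3))
print("lag-1 ACF, first difference: ", round(acf(diffed, 1)[1], 3))
```

A slowly decaying ACF close to 1 is typical of a non-stationary series, while the differenced series should show autocorrelations near zero. This is the intuition behind the "integrated" part of ARIMA.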
