Parameter estimation of complex exponential signals corrupted by additive white Gaussian noise (AWGN) is crucial in the study of distributed beamforming in practical scenarios. A near-zero phase offset is expected at the receiver end, which relies on the smoothing and correction of the frequency and phase estimates. Neither computational complexity nor processing latency affects the expected zero phase offset, but the estimation accuracy does. Thus, the maximum likelihood estimator (MLE) based on the fast Fourier transform (FFT) is considered, with and without post-processing, for locating the maximum peaks. Details of how the phase estimates are obtained are not always covered in the literature but are explained in this article. Numerical results show that the global maximum peaks are located by employing a fine search with larger FFT sizes.
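As a rough illustration of this coarse-then-fine FFT peak search, the sketch below estimates the frequency and phase of a single complex exponential in AWGN; the signal parameters, SNR and FFT sizes are illustrative assumptions, not values from the article.

```python
import numpy as np

# Minimal sketch: ML frequency/phase estimation of a complex exponential in AWGN
# via an FFT peak search (coarse) refined by a zero-padded (fine) FFT search.
rng = np.random.default_rng(0)

N = 256                                   # number of samples (illustrative)
f_true, phi_true, A = 0.1234, 0.7, 1.0    # illustrative true parameters
n = np.arange(N)
snr_db = 10.0
noise_var = A**2 / 10**(snr_db / 10)
x = A * np.exp(1j * (2 * np.pi * f_true * n + phi_true))
x += np.sqrt(noise_var / 2) * (rng.standard_normal(N) + 1j * rng.standard_normal(N))

# Coarse search: peak of the periodogram on the N-point FFT grid
X = np.fft.fft(x, N)
f_coarse = np.argmax(np.abs(X)) / N

# Fine search: zero-pad to a much larger FFT size to refine the peak location
M = 16 * N
Xf = np.fft.fft(x, M)
f_hat = np.argmax(np.abs(Xf)) / M

# Phase estimate: argument of the correlation at the estimated frequency
phi_hat = np.angle(np.sum(x * np.exp(-1j * 2 * np.pi * f_hat * n)))

print(f"f_true={f_true:.4f}  f_coarse={f_coarse:.4f}  f_fine={f_hat:.4f}")
print(f"phi_true={phi_true:.3f}  phi_hat={phi_hat:.3f}")
```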
The replicated linear functional relationship model is often used to describe the relationship between two circular variables when both variables are subject to error and replicate observations are available. We derive the estimate of the rotation parameter of the model using the maximum likelihood method. The performance of the proposed method is studied through simulation, and the bias of the estimates is found to be small, implying the suitability of the method. A practical application of the method is illustrated using a real data set.
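The sketch below is not the authors' replicated functional relationship model; it uses a simplified rotation-only circular model with von Mises errors, for which the maximum likelihood estimate of the rotation parameter is the circular mean of the angular differences.

```python
import numpy as np

# Minimal sketch (simplified stand-in): ML estimation of a rotation parameter beta
# in the circular model v = (u + beta) mod 2*pi with von Mises errors.
# The MLE of beta is the circular mean of the differences (v - u).
rng = np.random.default_rng(1)

n = 200
beta_true = 0.8                          # illustrative rotation (radians)
kappa = 5.0                              # von Mises concentration
u = rng.uniform(0, 2 * np.pi, n)         # circular explanatory variable
v = (u + beta_true + rng.vonmises(0.0, kappa, n)) % (2 * np.pi)

d = v - u
beta_hat = np.arctan2(np.sin(d).sum(), np.cos(d).sum()) % (2 * np.pi)

# Circular bias of the estimate
bias = np.angle(np.exp(1j * (beta_hat - beta_true)))
print(f"beta_true={beta_true:.3f}  beta_hat={beta_hat:.3f}  bias={bias:.4f}")
```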
In this paper, we study Tsallis' fractional entropy (TFE) in a complex domain by applying the definition of complex probability functions. We study the upper and lower bounds of TFE based on some special functions. Moreover, applications in complex neural networks (CNNs) are illustrated to assess the accuracy of CNNs.
A Poisson model is typically assumed for count data, but when the response variable contains many zeros and is overdispersed, negative binomial regression is suggested as the count regression instead of Poisson regression. In this paper, a zero-inflated negative binomial regression model for right-truncated count data was developed. In this model, we considered a response variable and one or more explanatory variables. The estimation of the regression parameters using the maximum likelihood method was discussed and the goodness-of-fit of the regression model was examined. We studied the effects of truncation in terms of parameter estimation, standard errors and goodness-of-fit statistics via real data. The results showed a better fit for the truncated zero-inflated negative binomial regression model when the response variable has many zeros and is right truncated.
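A minimal sketch of maximum likelihood estimation for an untruncated zero-inflated negative binomial (ZINB) regression is given below; the covariate, parameter values and starting points are illustrative. The right truncation studied in the abstract would additionally renormalize the pmf by P(Y <= c) and is omitted here for brevity.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import gammaln, expit

def nb_logpmf(y, mu, alpha):
    """NB2 log-pmf with mean mu and dispersion alpha."""
    r = 1.0 / alpha
    return (gammaln(y + r) - gammaln(r) - gammaln(y + 1)
            + r * np.log(r / (r + mu)) + y * np.log(mu / (r + mu)))

def zinb_negloglik(theta, y, x):
    # theta = (intercept, slope, logit of zero-inflation prob, log dispersion)
    b0, b1, logit_pi, log_alpha = theta
    mu = np.exp(b0 + b1 * x)
    pi = expit(logit_pi)
    alpha = np.exp(log_alpha)
    lp_nb = nb_logpmf(y, mu, alpha)
    ll = np.where(y == 0,
                  np.log(pi + (1 - pi) * np.exp(lp_nb)),
                  np.log(1 - pi) + lp_nb)
    return -ll.sum()

# Simulated example data (illustrative parameters)
rng = np.random.default_rng(2)
n = 1000
x = rng.normal(size=n)
mu = np.exp(0.5 + 0.8 * x)
alpha_true, pi_true = 0.7, 0.3
y_nb = rng.negative_binomial(1 / alpha_true, 1 / (1 + alpha_true * mu))
y = np.where(rng.uniform(size=n) < pi_true, 0, y_nb)

x0 = [np.log(y.mean() + 0.1), 0.1, -1.0, -0.5]
res = minimize(zinb_negloglik, x0=x0, args=(y, x), method="Nelder-Mead",
               options={"maxiter": 5000})
b0, b1, logit_pi, log_alpha = res.x
print("beta:", b0, b1, " pi:", expit(logit_pi), " alpha:", np.exp(log_alpha))
```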
The missing-value problem is common when analysing quantitative data. With the rapid growth of computing capabilities, advanced methods, in particular those based on maximum likelihood estimation, have been suggested to best handle missing values. In this paper, two modern imputation approaches, namely expectation-maximization (EM) and expectation-maximization with bootstrapping (EMB), are proposed for two kinds of linear functional relationship (LFRM) models: LFRM1 for the full model and LFRM2, in which the slope parameter is estimated using a nonparametric approach. The performance of EM and EMB is measured using the mean absolute error, root-mean-square error and estimated bias. The results of the simulation study suggest that both the EM and EMB methods are applicable to the LFRM, with the EMB algorithm outperforming the standard EM algorithm. An illustration using a practical example and a real data set is provided.
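The following is a minimal sketch of EM-based imputation for a bivariate normal model with values missing in one variable, as a simplified stand-in for the LFRM setting; the data and missingness rate are synthetic.

```python
import numpy as np

# Minimal sketch: EM imputation for bivariate normal data with missing y values.
# E-step: replace each missing y by its conditional mean given x (tracking the
# conditional variance).  M-step: re-estimate the mean vector and covariance.
rng = np.random.default_rng(3)

n = 500
x = rng.normal(2.0, 1.0, n)
y = 1.0 + 0.8 * x + rng.normal(0.0, 0.5, n)
miss = rng.uniform(size=n) < 0.25            # ~25% of y missing at random
y_obs = np.where(miss, np.nan, y)

# Initialize with complete cases
mu = np.array([x.mean(), np.nanmean(y_obs)])
S = np.cov(x[~miss], y_obs[~miss])

for _ in range(200):
    # E-step: conditional mean and variance of missing y given x
    b = S[0, 1] / S[0, 0]
    y_hat = np.where(miss, mu[1] + b * (x - mu[0]), y_obs)
    c = np.where(miss, S[1, 1] - b * S[0, 1], 0.0)

    # M-step: update mean and covariance from the completed sufficient statistics
    mu_new = np.array([x.mean(), y_hat.mean()])
    xc, yc = x - mu_new[0], y_hat - mu_new[1]
    S_new = np.array([[np.mean(xc * xc), np.mean(xc * yc)],
                      [np.mean(xc * yc), np.mean(yc * yc + c)]])
    converged = np.allclose(S_new, S, atol=1e-10) and np.allclose(mu_new, mu, atol=1e-10)
    mu, S = mu_new, S_new
    if converged:
        break

# EM-based imputation of the missing y values
y_imputed = np.where(miss, mu[1] + (S[0, 1] / S[0, 0]) * (x - mu[0]), y_obs)
print("imputation MAE:", np.mean(np.abs(y_imputed[miss] - y[miss])))
```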
Parameter estimation in the Generalized Autoregressive Conditional Heteroscedastic (GARCH) model has received much attention in the literature. The commonly used quasi maximum likelihood estimator (QMLE) may not be suitable if the model is misspecified. Alternatively, we can consider the variance targeting estimator (VTE), as it appears to be a better fit for misspecified initial parameters. This paper extends the application to examine how both QMLE and VTE perform under error distribution misspecification. Data are simulated under two error distribution conditions: a true normal error distribution and a true Student-t error distribution with 3 degrees of freedom. The error distribution assumptions selected for this study are the normal distribution, Student-t distribution, skewed normal distribution and skewed Student-t distribution. In addition, this study also examines the effect of initial parameter specification.
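A minimal sketch of the two estimators is shown below for a GARCH(1,1) model simulated with Student-t(3) errors but fitted with a Gaussian quasi-likelihood (a deliberate error-distribution misspecification); all parameter values are illustrative.

```python
import numpy as np
from scipy.optimize import minimize

def simulate_garch(n, omega, alpha, beta, df, rng):
    """Simulate GARCH(1,1) returns with unit-variance Student-t errors."""
    r = np.empty(n)
    sigma2 = omega / (1 - alpha - beta)
    for t in range(n):
        z = rng.standard_t(df) * np.sqrt((df - 2) / df)
        r[t] = np.sqrt(sigma2) * z
        sigma2 = omega + alpha * r[t] ** 2 + beta * sigma2
    return r

def gaussian_negloglik(r, omega, alpha, beta):
    """Gaussian quasi-log-likelihood (negated) of a GARCH(1,1) model."""
    if omega <= 0 or alpha < 0 or beta < 0 or alpha + beta >= 1:
        return 1e10
    sigma2 = np.empty_like(r)
    sigma2[0] = r.var()
    for t in range(1, len(r)):
        sigma2[t] = omega + alpha * r[t - 1] ** 2 + beta * sigma2[t - 1]
    return 0.5 * np.sum(np.log(2 * np.pi) + np.log(sigma2) + r ** 2 / sigma2)

rng = np.random.default_rng(4)
r = simulate_garch(5000, omega=0.1, alpha=0.1, beta=0.8, df=3, rng=rng)

# QMLE: estimate (omega, alpha, beta) jointly
qmle = minimize(lambda p: gaussian_negloglik(r, *p), x0=[0.05, 0.05, 0.9],
                method="Nelder-Mead")

# VTE: tie omega to the sample variance, estimate only (alpha, beta)
def vte_obj(p):
    alpha, beta = p
    omega = r.var() * (1 - alpha - beta)
    return gaussian_negloglik(r, omega, alpha, beta)

vte = minimize(vte_obj, x0=[0.05, 0.9], method="Nelder-Mead")
print("QMLE (omega, alpha, beta):", qmle.x)
print("VTE  (alpha, beta), implied omega:", vte.x, r.var() * (1 - vte.x.sum()))
```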
This study developed 0.05° × 0.05° land-only datasets of daily maximum and minimum temperatures in the densely populated Central North region of Egypt (CNE) for the period 1981-2017. Existing coarse-resolution datasets were evaluated to find the best dataset for the study area to use as the basis of the new datasets. The Climate Prediction Centre (CPC) global temperature dataset was found to be the best. The CPC data were interpolated to a spatial resolution of 0.05° latitude/longitude using a linear interpolation technique, considering the flat topography of the study area. The robust kernel density distribution mapping method was used to correct the bias against observations, and WorldClim v.2 temperature climatology was used to adjust the spatial variability in temperature. The validation of the CNE datasets using a probability density function skill score and hot and cold extremes tail skill scores showed a remarkable improvement in replicating the spatial and temporal variability of observed temperature. As the CNE datasets are the best available high-resolution estimate of daily temperatures, they will be beneficial for climatic and hydrological studies.
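The sketch below uses simple empirical quantile mapping on synthetic data as a simplified stand-in for the kernel density distribution mapping bias correction named above; it only illustrates the general idea of mapping model quantiles onto observed quantiles.

```python
import numpy as np

# Minimal sketch (empirical quantile mapping, a simplified stand-in for kernel
# density distribution mapping): bias-correct a gridded temperature series
# against station observations.  Data are synthetic and illustrative.
rng = np.random.default_rng(5)

obs = rng.normal(30.0, 4.0, 5000)            # illustrative observed Tmax (deg C)
model = rng.normal(28.0, 5.0, 5000)          # illustrative biased gridded Tmax

def quantile_map(model_values, model_ref, obs_ref, n_q=100):
    """Map model values onto observed quantiles estimated over a reference period."""
    q = np.linspace(0.005, 0.995, n_q)
    mq = np.quantile(model_ref, q)
    oq = np.quantile(obs_ref, q)
    # Each model value is placed on the model quantile curve and replaced by the
    # corresponding observed quantile (linear interpolation between quantiles).
    return np.interp(model_values, mq, oq)

corrected = quantile_map(model, model, obs)
print("mean bias before:", model.mean() - obs.mean())
print("mean bias after :", corrected.mean() - obs.mean())
```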
The location model proposed in the past is a predictive discriminant rule that can classify new observations into one of two predefined groups based on mixtures of continuous and categorical variables. The ability of the location model to classify new observations correctly is highly dependent on the number of multinomial cells created by the categorical variables. This study conducts a preliminary investigation showing that the location model using maximum likelihood estimation has a high misclassification rate, up to 45% on average, when dealing with more than six categorical variables, for all 36 data sets tested. Such a model yields highly unreliable predictions, as it performs poorly with a large number of categorical variables even when the sample size is large. To alleviate the high misclassification rate, a new strategy is embedded in the discriminant rule by introducing nonlinear principal component analysis (NPCA) into the classical location model (cLM), mainly to handle the large number of categorical variables. This new strategy is investigated on simulated and real datasets through estimation of the misclassification rate using the leave-one-out method. The results of the numerical investigations demonstrate the feasibility of the proposed model, as the misclassification rate is dramatically decreased compared with the cLM for all 18 data settings. A practical application using a real dataset demonstrates a significant improvement and obtains results comparable to the best of the methods compared. The overall findings reveal that the proposed model extends the applicable range of the location model, which was previously limited to six categorical variables for acceptable performance. This study shows that the proposed model, with its new discrimination procedure, can be used as an alternative for mixed-variable classification problems, primarily when facing a large number of categorical variables.
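The location model itself is not implemented in the sketch below; a logistic regression on dummy-coded mixed data merely stands in for the discriminant rule, to illustrate the leave-one-out misclassification estimate used as the evaluation criterion. All data are synthetic.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import LeaveOneOut

# Minimal sketch: leave-one-out estimation of the misclassification rate for a
# classifier on mixed continuous/categorical data (logistic regression as a
# stand-in for the location model).
rng = np.random.default_rng(6)

n = 120
x_cont = rng.normal(size=(n, 2))                     # continuous variables
x_cat = rng.integers(0, 2, size=(n, 3))              # binary categorical variables
y = (x_cont[:, 0] + x_cat[:, 0] + rng.normal(scale=1.0, size=n) > 0.5).astype(int)
X = np.hstack([x_cont, x_cat])                       # simple dummy coding

errors = 0
for train_idx, test_idx in LeaveOneOut().split(X):
    clf = LogisticRegression(max_iter=1000).fit(X[train_idx], y[train_idx])
    errors += int(clf.predict(X[test_idx])[0] != y[test_idx][0])

print("leave-one-out misclassification rate:", errors / n)
```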
Whole-genome duplications (WGDs) are widespread and prevalent in vascular plants and frequently coincide with major episodes of global and climatic upheaval, including the mass extinction at the Cretaceous-Tertiary boundary (c. 65 Ma) and more recent periods of global aridification in the Miocene (c. 10-5 Ma). Here, we explore WGDs in the diverse flowering plant clade Malpighiales. Using transcriptomes and complete genomes from 42 species, we applied a multipronged phylogenomic pipeline to identify, locate, and determine the age of WGDs in Malpighiales using three means of inference: distributions of synonymous substitutions per synonymous site (Ks) among paralogs, phylogenomic (gene tree) reconciliation, and a likelihood-based gene-count method. We conservatively identify 22 ancient WGDs, widely distributed across Malpighiales subclades. Importantly, these events are clustered around the Eocene-Paleocene transition (c. 54 Ma), during which the planet was warmer and wetter than at any other period in the Cenozoic. These results establish that the Eocene Climatic Optimum likely represents a previously unrecognized period of prolific WGDs in plants, and lend further support to the hypothesis that polyploidization promotes adaptation and enhances plant survival during episodes of global change, especially for tropical organisms like Malpighiales, which have tight thermal tolerances.
In this research, we introduce an analysis procedure using the Kullback-Leibler information criterion (KLIC) as a statistical tool to evaluate and compare the predictive abilities of possibly misspecified density forecast models. The main advantage of this statistical tool is that we use censored likelihood functions to compute the KLIC in the tails, allowing the performance of density forecast models to be compared in the tails. Use of the KLIC is practically attractive and convenient, given its equivalence to the widely used likelihood ratio (LR) test. We include an illustrative simulation comparing a set of distributions, including symmetric and asymmetric distributions, and a family of GARCH volatility models. Our results on simulated data show that the choice of the conditional distribution appears to be a more dominant factor in determining the adequacy and accuracy (quality) of density forecasts than the choice of volatility model.
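A minimal sketch of a censored-likelihood tail comparison of two density forecasts is given below; the data-generating process, candidate forecasts and tail threshold are illustrative and the sketch is not the full KLIC testing procedure of the paper.

```python
import numpy as np
from scipy import stats

# Minimal sketch: censored likelihood scores for comparing two density forecasts
# in the left tail.  Outside the tail region only the tail probability mass
# enters the score, which focuses the comparison on tail performance.
rng = np.random.default_rng(7)

y = rng.standard_t(df=5, size=5000)          # "true" heavy-tailed observations
r = np.quantile(y, 0.10)                     # left-tail threshold (10% quantile)

def censored_loglik_score(y, r, logpdf, cdf):
    """Censored likelihood score on the region A = (-inf, r]."""
    in_tail = y <= r
    return np.where(in_tail, logpdf(y), np.log(1.0 - cdf(r)))

# Candidate density forecasts: standard normal vs Student-t with 5 df
s_norm = censored_loglik_score(y, r, stats.norm.logpdf, stats.norm.cdf)
s_t = censored_loglik_score(y, r, lambda v: stats.t.logpdf(v, df=5),
                            lambda v: stats.t.cdf(v, df=5))

d = s_t - s_norm                             # positive mean favours the t forecast
tstat = d.mean() / (d.std(ddof=1) / np.sqrt(len(d)))
print(f"mean score difference={d.mean():.4f}  t-statistic={tstat:.2f}")
```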
Extreme Value Theory (EVT) is a statistical field whose main focus is the investigation of extreme phenomena. In EVT, the Fréchet distribution is one of the extreme value distributions and is used to model extreme events. The degree of fit between the model and the observed values is measured by goodness-of-fit (GOF) tests, and several types of GOF tests were compared. The tests involved were Anderson-Darling (AD), Cramer-von Mises (CVM), Zhang Anderson-Darling (ZAD), Zhang Cramer-von Mises (ZCVM) and Ln. The parameters μ, σ and ξ were estimated by maximum likelihood. Critical values were developed by Monte Carlo simulation, and a power study was used to assess the reliability of the critical values. It is also of interest to identify which GOF test is superior to the others for the Fréchet distribution. Thus, rejection rates were compared at different significance levels and sample sizes, based on several alternative distributions. Overall, with parameters obtained by maximum likelihood estimation of the Fréchet distribution, the ZAD and ZCVM tests are the most powerful for smaller sample sizes (ZAD for significance levels 0.05 and 0.1, ZCVM for significance level 0.01), whereas AD is more powerful for larger sample sizes.
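A minimal sketch of this workflow, fitting a Fréchet distribution by maximum likelihood, computing the Anderson-Darling statistic and deriving a Monte Carlo critical value by parametric bootstrap, is shown below. The location parameter is fixed at zero for simplicity, whereas the abstract estimates all three parameters; sample sizes and parameter values are illustrative.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(8)

def ad_statistic(x, cdf):
    """Anderson-Darling statistic for a fully specified cdf."""
    x = np.sort(x)
    n = len(x)
    u = np.clip(cdf(x), 1e-12, 1 - 1e-12)
    i = np.arange(1, n + 1)
    return -n - np.mean((2 * i - 1) * (np.log(u) + np.log(1 - u[::-1])))

# Illustrative sample from a Fréchet distribution (scipy's invweibull family)
x = stats.invweibull.rvs(c=2.5, loc=0.0, scale=3.0, size=100, random_state=rng)

# Maximum likelihood fit (location fixed at zero) and observed AD statistic
c_hat, loc_hat, scale_hat = stats.invweibull.fit(x, floc=0)
ad_obs = ad_statistic(x, lambda v: stats.invweibull.cdf(v, c_hat, loc_hat, scale_hat))

# Monte Carlo critical value at the 5% level (parameters re-estimated each draw)
ad_sim = []
for _ in range(200):
    xs = stats.invweibull.rvs(c_hat, loc_hat, scale_hat, size=len(x), random_state=rng)
    cs, ls, ss = stats.invweibull.fit(xs, floc=0)
    ad_sim.append(ad_statistic(xs, lambda v: stats.invweibull.cdf(v, cs, ls, ss)))

crit_05 = np.quantile(ad_sim, 0.95)
print(f"AD={ad_obs:.3f}  5% critical value={crit_05:.3f}  reject={ad_obs > crit_05}")
```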
To date, research on the prescribing decisions of physicians lacks sound theoretical foundations. In fact, drug prescribing by doctors is a complex phenomenon influenced by various factors. Most of the existing studies in the area of drug prescription explain the process of decision-making by physicians via an exploratory rather than a theoretical approach. Therefore, this review attempts to suggest a conceptual model that explains the theoretical linkages between marketing efforts, the patient, the pharmacist and the physician's decision to prescribe drugs. The paper follows an inclusive review approach and applies previous theoretical models of prescribing behaviour to identify the relational factors. More specifically, it identifies and uses several valuable perspectives, such as the 'persuasion theory - elaboration likelihood model', the 'stimuli-response marketing model', the 'agency theory', the 'theory of planned behaviour' and 'social power theory', in developing an innovative conceptual paradigm. Based on the combination of existing methods and previous models, this paper suggests a new conceptual model of the physician decision-making process. This model has the potential for use in further research.
Environmental degradation remains a huge obstacle to sustainable development. Research on the factors that promote or degrade the environment has been conducted extensively. However, one important variable that has received conspicuously limited attention is energy innovation. To address this gap in the literature, this study investigated the effects of energy innovations on environmental quality in the U.S. for the period 1974 to 2016, incorporating GDP and immigration as additional regressors. Three indices, comprising CO2 emissions, ecological footprint and carbon footprint, were used to proxy environmental degradation. Cointegration tests established long-run relationships between the variables. Using a maximum likelihood approach with a break, the results showed that energy innovations significantly improve environmental quality, GDP degrades the quality of the environment, and immigration has no significant effect on the environment. Policy implications of the results are discussed in the body of the manuscript.
The Weibull distribution is one of the most important lifetime distributions used for modelling and analysing data in the clinical, life sciences and engineering fields. The main objective of this paper is to determine the best estimator for the two-parameter Weibull distribution. The methods under consideration are the frequentist maximum likelihood estimator, the least squares regression estimator and the Bayesian estimator under two loss functions, squared error and linear exponential. The Lindley approximation is used to obtain the Bayes estimates. Comparisons are made through a simulation study to determine the performance of these methods. Based on the results of this simulation study, the Bayesian approach to estimating the Weibull parameters under the linear exponential loss function is found to be superior to the conventional maximum likelihood and least squares methods.
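A minimal sketch comparing the maximum likelihood and least-squares (probability plot) estimators of the two-parameter Weibull distribution is given below; the Bayesian/Lindley estimators from the abstract are omitted and the true parameter values are illustrative.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(9)

shape_true, scale_true = 1.8, 2.5
x = stats.weibull_min.rvs(shape_true, scale=scale_true, size=200, random_state=rng)

# Maximum likelihood (location fixed at zero for the two-parameter model)
shape_mle, _, scale_mle = stats.weibull_min.fit(x, floc=0)

# Least-squares regression on the Weibull probability plot:
# ln(-ln(1 - F_i)) = k*ln(x_(i)) - k*ln(lambda), using median-rank plotting positions
xs = np.sort(x)
n = len(xs)
F = (np.arange(1, n + 1) - 0.3) / (n + 0.4)
yy = np.log(-np.log(1.0 - F))
slope, intercept = np.polyfit(np.log(xs), yy, 1)
shape_lse = slope
scale_lse = np.exp(-intercept / slope)

print(f"true: shape={shape_true}, scale={scale_true}")
print(f"MLE : shape={shape_mle:.3f}, scale={scale_mle:.3f}")
print(f"LSE : shape={shape_lse:.3f}, scale={scale_lse:.3f}")
```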
The cost of parentage assignment precludes its application in many selective breeding programmes and molecular ecology studies, and/or limits the circumstances or number of individuals to which it is applied. Pooling samples from more than one individual, and using appropriate genetic markers and algorithms to determine parental contributions to pools, is one means of reducing the cost of parentage assignment. This paper describes and validates a novel maximum likelihood (ML) parentage-assignment method that can be used to accurately assign parentage to pooled samples of multiple individuals (previously published ML methods are applicable to samples of single individuals only) using low-density single nucleotide polymorphism (SNP) 'quantitative' (also referred to as 'continuous') genotype data. It is demonstrated with simulated data that, when applied to pools, this 'quantitative maximum likelihood' method assigns parentage with greater accuracy than established maximum likelihood parentage-assignment approaches, which rely on accurate discrete genotype calls; exclusion methods; and estimating parental contributions to pools by solving the weighted least squares problem. Quantitative maximum likelihood can be applied to pools generated using either a 'pooling-for-individual-parentage-assignment' approach, whereby each individual in a pool is tagged or traceable and comes from a known and mutually exclusive set of possible parents, or a 'pooling-by-phenotype' approach, whereby individuals of the same or similar phenotypes are pooled. Although computationally intensive when applied to large pools, quantitative maximum likelihood has the potential to substantially reduce the cost of parentage assignment, even if applied to pools comprising few individuals.
The reliability of the electrical distribution system is a contemporary research field owing to the diverse applications of electricity in everyday life and in industry; however, only a few research papers exist in the literature. This paper proposes a methodology for assessing the reliability of 33/11 kilovolt high-power stations based on the average time between failures. The objective of this paper is to find the optimal fit for the failure data via the time between failures. We determine the parameter estimates for all components of the station, and we estimate the reliability of each component and of the system as a whole. The best-fitting distribution for the time between failures is a three-parameter Dagum distribution with a scale parameter [Formula: see text] and shape parameters [Formula: see text] and [Formula: see text]. Our analysis reveals that the reliability value decreases by 38.2% every 30 days. We believe that this paper is the first to address this issue, and the results obtained reflect its originality. We also suggest that these results are of practical use for power systems, both for maintenance models and for preventive maintenance models.
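A minimal sketch of fitting a Dagum distribution to time-between-failure data and evaluating the reliability function is shown below; it uses scipy's Burr Type III family (which, with a scale parameter, coincides with the Dagum distribution), and the data and parameter values are illustrative rather than the station data of the paper.

```python
import numpy as np
from scipy import stats

# Minimal sketch: ML fit of a three-parameter Dagum distribution (Burr Type III
# with scale) to time-between-failure data, then reliability R(t) = 1 - F(t).
rng = np.random.default_rng(12)
tbf = stats.burr.rvs(c=3.0, d=1.5, scale=20.0, size=300, random_state=rng)  # days

# Maximum likelihood fit (location fixed at zero)
c_hat, d_hat, _, scale_hat = stats.burr.fit(tbf, floc=0)

# Reliability after 30 and 60 days under the fitted model
for t in (30.0, 60.0):
    R = stats.burr.sf(t, c_hat, d_hat, 0, scale_hat)
    print(f"R({t:.0f} days) = {R:.3f}")
```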
Many factors influence the PM(10) concentration in the atmosphere. This paper examines the PM(10) concentration in relation to the wet season (north-east monsoon) and the dry season (south-west monsoon) in Seberang Perai, Malaysia, from 2000 to 2004. PM(10) is expected to peak during the south-west monsoon, when the weather is dry, and this study confirms that the highest PM(10) concentrations from 2000 to 2004 were recorded during this monsoon. Two probability distributions, Weibull and lognormal, were used to model the PM(10) concentration, and the best model for prediction was selected based on performance indicators. The lognormal distribution represents the data better than the Weibull distribution for 2000, 2001 and 2002, whereas for 2003 and 2004 the Weibull distribution performs better. The proposed distributions were successfully used to estimate exceedances and predict return periods for the subsequent year.
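The sketch below illustrates this kind of analysis on synthetic data: fit lognormal and Weibull distributions, compare them by log-likelihood, and estimate the exceedance probability and return period for an assumed threshold of 150 ug/m3 (used here only as an example value).

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(10)
pm10 = stats.lognorm.rvs(s=0.5, scale=55.0, size=365, random_state=rng)  # synthetic year

threshold = 150.0  # example 24-h PM10 threshold (illustrative)

# Fit both candidate distributions (location fixed at zero)
ln_shape, _, ln_scale = stats.lognorm.fit(pm10, floc=0)
wb_shape, _, wb_scale = stats.weibull_min.fit(pm10, floc=0)

ll_ln = stats.lognorm.logpdf(pm10, ln_shape, 0, ln_scale).sum()
ll_wb = stats.weibull_min.logpdf(pm10, wb_shape, 0, wb_scale).sum()
best = "lognormal" if ll_ln > ll_wb else "Weibull"

# Exceedance probability and return period under the better-fitting model
if best == "lognormal":
    p_exceed = stats.lognorm.sf(threshold, ln_shape, 0, ln_scale)
else:
    p_exceed = stats.weibull_min.sf(threshold, wb_shape, 0, wb_scale)

print(f"best model: {best}  P(PM10 > {threshold}) = {p_exceed:.4f}")
print(f"expected exceedances per year: {365 * p_exceed:.1f}"
      f"  return period: {1.0 / p_exceed:.0f} days")
```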
BACKGROUND: The demand in biobanking for the collection and maintenance of biological specimens and personal data from civilians to improve the prevention, diagnosis and treatment of diseases has increased notably. Despite the advancement, certain issues, specifically those related to privacy and data protection, have been critically discussed. The purposes of this study are to assess the willingness of stakeholders to participate in biobanking and to determine its predictors.
METHODS: A survey of 469 respondents from various stakeholder groups in the Klang Valley region of Malaysia was carried out. Based on previous research, a multi-dimensional instrument measuring willingness to participate in biobanking, and its predictors, was constructed and validated. A single-step Structural Equation Modelling analysis was performed to analyse the measurement and structural models using the International Business Machines Corporation Software Package for Social Sciences, Analysis of Moment Structures (IBM SPSS Amos) version 20 with a maximum likelihood function.
RESULTS: Malaysian stakeholders in the Klang Valley were found to be cautious of biobanks. Although they perceived the biobanks as moderately beneficial (mean score of 4.65) and were moderately willing to participate in biobanking (mean score of 4.10), they professed moderate concern about data and specimen protection issues (mean score of 4.33). Willingness to participate in biobanking was predominantly determined by four direct predictors: specific application-linked perceptions of their benefits (β = 0.35, p
Electroencephalogram (EEG)-based decoding of human brain activity is challenging, owing to the low spatial resolution of EEG. However, EEG is an important technique, especially for brain-computer interface applications. In this study, a novel algorithm is proposed to decode brain activity associated with different types of images. In this hybrid algorithm, a convolutional neural network is modified for feature extraction, a t-test is used for the selection of significant features, and likelihood ratio-based score fusion is used for the prediction of brain activity. The proposed algorithm takes input data from multichannel EEG time series, an approach also known as multivariate pattern analysis. A comprehensive analysis was conducted using data from 30 participants. The results of the proposed method are compared with currently recognized feature extraction and classification/prediction techniques. The wavelet transform-support vector machine method, the most popular feature extraction and prediction method currently in use, showed an accuracy of 65.7%, whereas the proposed method predicts novel data with an improved accuracy of 79.9%. In conclusion, the proposed algorithm outperformed the current feature extraction and prediction methods.
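The sketch below illustrates only the last two stages of such a hybrid pipeline, t-test feature selection followed by likelihood-ratio score fusion; synthetic Gaussian features stand in for the CNN activations, and the thresholds and dimensions are illustrative.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(13)

n, d, d_informative = 400, 100, 10
y = rng.integers(0, 2, n)
X = rng.normal(size=(n, d))
X[:, :d_informative] += 0.8 * y[:, None]        # a few class-informative features

# Train/test split
tr, te = np.arange(0, 300), np.arange(300, n)

# Feature selection: keep features with significant two-sample t-tests on training data
t, p = stats.ttest_ind(X[tr][y[tr] == 1], X[tr][y[tr] == 0], axis=0)
selected = np.where(p < 0.01)[0]

# Likelihood-ratio score fusion: sum per-feature Gaussian log-likelihood ratios
mu1, sd1 = X[tr][y[tr] == 1][:, selected].mean(0), X[tr][y[tr] == 1][:, selected].std(0)
mu0, sd0 = X[tr][y[tr] == 0][:, selected].mean(0), X[tr][y[tr] == 0][:, selected].std(0)
llr = (stats.norm.logpdf(X[te][:, selected], mu1, sd1)
       - stats.norm.logpdf(X[te][:, selected], mu0, sd0)).sum(axis=1)
pred = (llr > 0).astype(int)
print("test accuracy:", (pred == y[te]).mean())
```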
Johor Bahru is developing rapidly, and pollution is an issue that needs to be considered because it has contributed to the number of asthma cases in the area. Therefore, the goal of this study is to investigate the behaviour of asthma disease in Johor Bahru through a count-analysis approach, namely the Poisson Integer Generalized Autoregressive Conditional Heteroscedasticity (Poisson-INGARCH) and Negative Binomial INGARCH (NB-INGARCH) models with identity and log link functions. Intervention analysis was conducted for the outbreak in the asthma data over the period July 2012 to July 2013, which perhaps occurred due to the extremely bad haze in Johor Bahru from Indonesian fires. The parameters were estimated by quasi-maximum likelihood estimation. Model assessment was based on the Pearson residuals, cumulative periodogram, probability integral transform (PIT) histogram, log-likelihood value, Akaike's information criterion (AIC) and Bayesian information criterion (BIC). Our results show that NB-INGARCH with identity and log link functions is adequate for representing the asthma data, with uncorrelated Pearson residuals, a higher log-likelihood, a well-behaved PIT histogram, and the lowest AIC and BIC. However, in terms of forecasting accuracy, NB-INGARCH with the identity link function performed better, with a smaller RMSE (8.54) for the sample data. Therefore, NB-INGARCH with the identity link function can be applied as the prediction model for asthma disease in Johor Bahru. Ideally, this outcome can assist the Department of Health in executing counteractive action and early planning to curb asthma disease in Johor Bahru.
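A minimal sketch of quasi-maximum likelihood estimation for a Poisson INGARCH(1,1) model with identity link is given below; the simulated series and parameter values are illustrative and the NB-INGARCH and intervention components of the study are not included.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import gammaln

def simulate_ingarch(n, omega, alpha, beta, rng):
    """Simulate counts from lambda_t = omega + alpha*y_{t-1} + beta*lambda_{t-1}."""
    y = np.empty(n, dtype=int)
    lam = omega / (1 - alpha - beta)
    for t in range(n):
        y[t] = rng.poisson(lam)
        lam = omega + alpha * y[t] + beta * lam
    return y

def poisson_quasi_negloglik(params, y):
    """Negative Poisson quasi-log-likelihood of an INGARCH(1,1) model."""
    omega, alpha, beta = params
    if omega <= 0 or alpha < 0 or beta < 0 or alpha + beta >= 1:
        return 1e10
    lam = np.empty(len(y))
    lam[0] = y.mean()
    for t in range(1, len(y)):
        lam[t] = omega + alpha * y[t - 1] + beta * lam[t - 1]
    return -(y * np.log(lam) - lam - gammaln(y + 1)).sum()

rng = np.random.default_rng(11)
y = simulate_ingarch(600, omega=2.0, alpha=0.3, beta=0.4, rng=rng)

fit = minimize(poisson_quasi_negloglik, x0=[1.0, 0.2, 0.2], args=(y,),
               method="Nelder-Mead")
omega_hat, alpha_hat, beta_hat = fit.x
print(f"omega={omega_hat:.3f}  alpha={alpha_hat:.3f}  beta={beta_hat:.3f}")
print("AIC:", 2 * 3 + 2 * fit.fun)
```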